Bytes for Data Engineers
and Buffers ... and Streams.
One of the most underutilized tools I’ve seen in all my many years of Data Engineering is the humble pairing of Bytes and Streams. I’m not sure why. They just never appear.
I see Strings, Ints, Floats, I see everything, but never a plain old Byte. Poor little bugger. Maybe people think it’s too complicated. In fact, it’s the opposite: less to go wrong, less complexity.
What’s more computationally expensive than Serialization and Deserialization? Especially in a Data Engineering context. Lots of data moving around, coming from this place and going to that place. Does it really need to be a String all the time? No.
Let’s take a look at Bytes, Streams, and Buffers in Python and Rust.
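To make the idea concrete before digging in, here is a minimal Python sketch of moving data from one place to another as raw bytes, with no decoding to a String along the way. The `copy_stream` helper, the chunk size, and the sample payload are all made up for illustration, and the `io.BytesIO` objects are just in-memory stand-ins for a real file or socket.

```python
import io


def copy_stream(src, dst, chunk_size=64 * 1024):
    """Copy src to dst in fixed-size chunks of raw bytes; return bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)  # returns b"" once the stream is exhausted
        if not chunk:
            break
        dst.write(chunk)  # bytes in, bytes out: nothing is decoded or re-encoded
        total += len(chunk)
    return total


# Hypothetical payload; in practice this would be a file, socket, or object store.
payload = "id,name\n1,café\n".encode("utf-8")
src = io.BytesIO(payload)
dst = io.BytesIO()

# A tiny chunk size forces several reads, even splitting a multi-byte character.
copied = copy_stream(src, dst, chunk_size=4)
```

Because the chunks are never turned into text, it doesn't matter that a read boundary can land in the middle of a multi-byte character. The bytes arrive on the other side exactly as they left.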
Thanks to Delta for sponsoring this newsletter! I personally use Delta Lake on a daily basis, and I believe this technology represents the future of Data Engineering. Check out their website below.