Very interesting. But how about observability / monitoring?
In the case of your example, adding the change to the existing batch infra allows easy monitoring without changing much.
How do you think we can still have centralized observability & monitoring with microservices in DE contexts?
Hi! Caveat: I work at Prefect. I used to be a data engineer and worked a lot with cloud functions like Lambdas.
I wrote this article recently about leveraging Prefect's event system and hosted webhooks to monitor and alert on Lambdas not firing. I thought you might find this interesting.
https://www.prefect.io/blog/monitoring-serverless-functions-tutorial
As long as we’re isolating individual services that can be executed on Lambdas, we can observe and monitor them like we do any traditional software.
For example, with AWS, we can use CloudWatch to track logs and traces, or send additional data to third-party observability tooling (Grafana, Datadog).
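To make that a bit more concrete, here is a rough sketch of what one isolated step could look like. It only assumes that anything a Lambda prints to stdout ends up in CloudWatch Logs; the `load_orders` service name, the `records` event field, and the `log_event` helper are made up for illustration and are not from the article.

```python
import json
import time


def log_event(service, status, **fields):
    """Build one JSON log line and print it (Lambda forwards stdout to CloudWatch Logs)."""
    record = {"service": service, "status": status, "ts": time.time(), **fields}
    line = json.dumps(record, sort_keys=True)
    print(line)
    return line


def handler(event, context):
    """Hypothetical Lambda entry point for a single pipeline step."""
    # The Lambda context object exposes aws_request_id; fall back when run locally.
    request_id = getattr(context, "aws_request_id", "local")
    start = time.perf_counter()
    try:
        rows = len(event.get("records", []))  # stand-in for the real work
        duration_ms = round((time.perf_counter() - start) * 1000, 2)
        log_event("load_orders", "ok", request_id=request_id,
                  rows=rows, duration_ms=duration_ms)
        return {"rows": rows}
    except Exception as exc:
        # Log the failure in the same shape, then let Lambda record the error.
        log_event("load_orders", "error", request_id=request_id, error=str(exc))
        raise
```

Because every service logs the same JSON shape, a central tool can filter on `service` and `status` across all of them, which is one way to keep monitoring centralized.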
Not wrong, but that still means watching additional logs & traces from another system.
And when using orchestrators, you are right, logs could be very easy to manage for external services.
Also, a microservice architecture could be ideal for cost reduction (less time spent keeping systems up).
Maybe we should create a PoC to shock the community.
I don't think data teams will move in that direction because of the skills and mental models they bring. As you said, most are not software engineers and embrace managed services, which get better by the day. Python will remain the high-level interface. Most DEngs I know want to get things done and not think too deeply about implementation details. Obviously it's still great to have the flexibility to choose a different model. Maybe when they get older and bored, but by then they may be managers or have accepted their fate.
This can be a challenge or an opportunity for anyone with an SE background to move into DE. I think it would be a vital added value for any DE to have such SE-based implementation knowledge in the field.
Daniel, what do you think about a comparison of implementing microservices with Lambda vs native Spark or an orchestrator?
This could reduce cost drastically.