2023-07-07 - Data Engineering #4 - Data Lakes - Chewing the Fat 📈 #5946
Replies: 73 comments 2 replies
-
EDIT: Updated to account for the change in question.
How heavily is a feature used compared with the effort spent developing it?
Source control and the end system
Depends on whether any transformation is needed. If none, a Data Warehouse; otherwise a Data Lake.
-
I would want to know whether we can draw any conclusions about how the combination of all the factors that feed into this data impacts productivity. It would be cool to see whether we can identify, for example, environmental factors in the offices that affect it. We of course need to be aware that correlation does not equal causation: one obvious example is that it would be easy to look at the data and conclude that the more time we spend on internal products, the less money we earn (when in fact the causal relationship is the other way around). I'm more interested in what makes developers happy and comfortable, which obviously affects revenue and the profitability of the business. It would also be cool to see whether we can identify client personas or verticals that yield the best outcomes, which would help us focus our efforts on driving engagement in those areas, or shoring up the gaps.
All of them.
Both. A Data Lake to aggregate all the data and feed into the analysis, and a Data Warehouse for reporting.
-
A cool question to ask of this data might be: "Can we predict a project's success based on historical timesheets, project data, associated skills and technologies, and correlating factors such as employee arrival/leave times, device usage, and website traffic during the project's timeframe?" For this question, I'd be interacting with the following systems:
Considering the variety, volume, and potentially semi-structured nature of this data, we would likely benefit from using a Data Lake to store the raw data. Then, we could use a Data Warehouse to store the processed, cleaned, and structured data ready for analysis. 🤖⭐
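To make the lake-then-warehouse idea concrete, here is a minimal sketch, not a real pipeline. It assumes the raw data lands as JSON lines (the record shape and field names such as `employee`, `hours` and `device` are invented for illustration) and uses an in-memory SQLite table as a stand-in for the warehouse.

```python
import json
import sqlite3

# Hypothetical raw timesheet events as they might land in a lake (JSON lines).
# Field names and values are invented for illustration only.
RAW_LAKE_LINES = [
    '{"employee": "alice", "project": "Northwind", "hours": "7.5", "date": "2023-07-03"}',
    '{"employee": "bob", "project": "Northwind", "hours": null, "date": "2023-07-03"}',
    '{"employee": "alice", "project": "Northwind", "hours": "6", "date": "2023-07-04", "device": "mac"}',
]


def load_warehouse(lines):
    """Clean raw records and load them into a typed, structured table."""
    conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    conn.execute(
        "CREATE TABLE timesheet (employee TEXT, project TEXT, work_date TEXT, hours REAL)"
    )
    for line in lines:
        record = json.loads(line)
        if record.get("hours") is None:
            continue  # skip incomplete rows; the raw copy still lives in the lake
        conn.execute(
            "INSERT INTO timesheet VALUES (?, ?, ?, ?)",
            (record["employee"], record["project"], record["date"], float(record["hours"])),
        )
    conn.commit()
    return conn


conn = load_warehouse(RAW_LAKE_LINES)
total_hours = conn.execute(
    "SELECT project, SUM(hours) FROM timesheet GROUP BY project"
).fetchall()
print(total_hours)  # [('Northwind', 13.5)]
```

The raw lines keep every field, including ones the table ignores (like `device`), which is the point of holding on to the lake copy: the schema can change later without losing data.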
-
What cool question would you ask this data? Be creative!
-
Show me SSW's commits per day over time
2
Warehouse
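For what it's worth, a rough sketch of the commits-per-day aggregation, assuming the commit timestamps have already been exported as ISO 8601 strings (the `sample` list is made up, and `commits_per_day` is a hypothetical helper, not part of any GitHub API):

```python
from collections import Counter
from datetime import datetime


def commits_per_day(timestamps):
    """Count commits per calendar day from ISO 8601 timestamp strings."""
    counts = Counter(
        datetime.fromisoformat(ts).date().isoformat() for ts in timestamps
    )
    # Sort by date so the result reads as a time series.
    return dict(sorted(counts.items()))


# Made-up sample data standing in for an export of commit timestamps.
sample = ["2023-07-03T09:15:00", "2023-07-03T16:40:00", "2023-07-04T11:05:00"]
daily = commits_per_day(sample)
print(daily)  # {'2023-07-03': 2, '2023-07-04': 1}
```

Since the data is already structured and the question is a fixed report, this fits the Warehouse answer above.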
-
Interesting stuff!!
-
Great video, it gives a better understanding of the roles of Data Engineer and Data Scientist.
-
Based on the CRM invoice and GitHub commit data, can we tell whether a developer's productivity differs between internal work and client work?
3
It would be useful to have them if we plan on making fancy dashboards or predictions
-
Is there a relationship between PR size and delivered value?
4
Data Lake - how value is defined will change based on project, context, and stakeholder, so being able to explore the data in different ways would be useful
-
How many complaints do we receive from our clients every month? 🧐
4 or 5 I think.
Data Warehouse
-
Data engineering looks really promising, I'm keen to see what insights we can get out of SSW's data!
-
Are people who work fewer hours more productive in the hours they work than people who work more hours?
Maybe 3
Data Warehouse
-
What cool question would you ask this data? Be creative! How many systems would you be interacting with? Would you need a Data Lake or Data Warehouse?
-
Which lead sources end up being the most profitable?
3
Data Warehouse
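As a rough illustration of how this could be computed, here is a sketch that joins leads to invoices and ranks sources by total profit. Everything in it is an assumption: the `leads` and `invoices` records, their field names, and the `profit_by_source` helper are hypothetical and do not reflect any real CRM schema.

```python
from collections import defaultdict

# Hypothetical CRM export: which lead source each client came from.
leads = [
    {"client": "A", "source": "referral"},
    {"client": "B", "source": "web"},
    {"client": "C", "source": "referral"},
]

# Hypothetical invoicing export: revenue and delivery cost per invoice.
invoices = [
    {"client": "A", "revenue": 1000.0, "cost": 600.0},
    {"client": "B", "revenue": 500.0, "cost": 450.0},
    {"client": "C", "revenue": 800.0, "cost": 500.0},
    {"client": "A", "revenue": 200.0, "cost": 100.0},
]


def profit_by_source(leads, invoices):
    """Join invoices to lead sources and rank sources by total profit."""
    source_of = {lead["client"]: lead["source"] for lead in leads}
    profit = defaultdict(float)
    for invoice in invoices:
        source = source_of.get(invoice["client"])
        if source is None:
            continue  # invoice for a client with no recorded lead source
        profit[source] += invoice["revenue"] - invoice["cost"]
    # Most profitable source first.
    return sorted(profit.items(), key=lambda item: item[1], reverse=True)


ranking = profit_by_source(leads, invoices)
print(ranking)  # [('referral', 800.0), ('web', 50.0)]
```

Both inputs are already structured and the join key is fixed, which is why a Data Warehouse seems sufficient here.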
-
Does the device you choose to work with (Windows, Mac, etc.) have a statistically significant impact on any metrics?
Possibly all of them
Data Warehouse.
-
Incredible video! What a great explanation.
-
Hey data engineers!
Let's talk about this rule
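The rule has this scenario: Imagine you work at Northwind. They have all these systems:
Comment below with your answers to these questions:
What cool question would you ask this data? Be creative!
How many systems would you be interacting with?
Would you need a Data Lake or Data Warehouse?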