September was a big month for marketing data infrastructure at Datacoral, especially in the context of our partnership with Amazon Web Services. At the beginning of the month, we hosted our first webinar and published the recording:
This event covered how Datacoral goes beyond popular cloud-based ETL and ELT products to deliver a cost-effective, scalable, and compelling data infrastructure platform within AWS using native AWS services.
Beyond supporting AWS best-practice ELT centered on S3, Redshift, and Athena, the five additional requirements are:
- Serverless architecture using AWS Lambda Functions as a Service (FaaS), which offers a cost-effective pay-per-use model in AWS, allowing you to avoid over-provisioning capacity.
- Secure deployment in your own Virtual Private Cloud where you control your keys and your data. We don’t see it.
- Orchestration and timing coordination throughout the pipeline, so that upstream dependencies are satisfied before downstream steps run.
- Change awareness, so that schema and data-value changes are caught while the pipeline is flowing and do not create downstream breakage.
- Publishing your curated and refined data to analytic, machine learning and operational systems, essentially getting the data to where it’s needed.
Datacoral’s founder, Raghu Murthy, was featured on Dan Woods’ podcast at EarlyAdopter.com, discussing the origin of Datacoral and the advantages of deploying serverless data integration technology.
We are very pleased that Amazon Web Services has published our blog post about how SQL is the data programming language used to build data pipelines in Datacoral.
This article builds upon our earlier data programming series and does a nice job of illustrating how we use SQL as the programming interface, combining it with header comments: key/value settings that tell Datacoral how to process the query. When we deploy it in our system, we call it a Data Programming Language (.dpl) file, but it’s just your query with Datacoral instructions in the headers.
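To make the idea concrete, here is a minimal sketch of what a query-with-headers file could look like. The header key names below (`schedule`, `output-table`, `depends-on`) are hypothetical illustrations, not Datacoral's actual syntax; the query itself is ordinary SQL.

```sql
-- Hypothetical header keys, for illustration only:
-- schedule: hourly                  (how often to run the query)
-- output-table: analytics.daily_signups
-- depends-on: raw.events            (upstream table this query reads)

SELECT
    DATE_TRUNC('day', created_at) AS signup_day,
    COUNT(*)                      AS signups
FROM raw.events
WHERE event_type = 'signup'
GROUP BY 1;
```

The point is that the business logic stays in plain SQL, while the comment headers carry the orchestration metadata the platform needs.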
We also just ran a webinar that profiled five of our customers: Greenhouse, Front, Jyve, Swing Education, and Cheetah. Characterized as Data Innovators, these fast-growing organizations are inventing transformative business models for the gig economy, logistics, mobile, collaboration, human capital management, and artificial intelligence. A recording of this event is available here, and it serves as the building block for what we will introduce next week.
On October 10th, we will unveil our next initiative, the Data Infrastructure for Startups Program, designed to help early-stage technologists overcome the inevitable issues in tapping their data resources. The common, cost-conscious mindset is to use a combination of open source and ingenuity to build out an initial data infrastructure. While it may be acceptable to trade an engineer’s time for money saved, fairly quickly that build becomes a maintenance burden that further bogs down the engineer’s productivity.
This is a very common problem for data engineers who are building and healing data pipelines, and this program will help resolve that, just as we did for the startups we featured earlier this week.
As we roll this out, we will also feature data solutions that we have implemented including:
- GAAP Reporting for Startups
- Churn Candidate Analysis
- Customer Usage and Satisfaction Indicators
- Data Product Planning Statistics
- A/B Testing Analysis
- Regulatory Compliance for GDPR and CCPA