Speaker: Michael Rude, Chief Operating Officer at Crux Informatics
Tell us a little about Crux Informatics.
Michael Rude: operational needs. The best way to think about Crux is that we build scalable data pipelines between data suppliers, which could be any external data source, and data consumers. We ensure that end users get the data they need, how they need it, and where they need it, including BigQuery. Currently we're delivering over 14,000 datasets from 160 data suppliers efficiently and at low cost for our consumers. Data delivered over Crux can be personalized and validated for quality, enabling analytics-ready data in BigQuery.
What data challenges were you facing as an enterprise?
Michael Rude: With regard to data challenges at Crux specifically, as you can imagine, we're processing hundreds, if not thousands, of terabytes per day. It's critical for us to have a scalable infrastructure that is also cost-effective. We needed to select a cloud data warehouse that not only fit our current storage requirements but would also meet our future requirements, as we're adding more suppliers and more data every day, and we anticipate supporting thousands of data suppliers.
What solution and architecture did you decide on and why did you decide to choose Google Cloud?
Michael Rude: Crux is 100% cloud native and, as a little-known fact, we were born in Google Cloud, but our original cloud data warehouse was not BigQuery. We made the switch to BigQuery last year on the back of a discovery exercise with the product teams at Google. We found that there are a number of advantages to running all of the data on the Crux platform through BigQuery, and we'll discuss some of those advantages shortly. But performance and BigQuery's pricing model were among the biggest factors in the move.
How do BigQuery and other GCP products help Crux?
Michael Rude: Beyond pricing, choosing BigQuery provides other advantages to Crux, including the ability to integrate with and access other Google Cloud tools and services. Some examples: Crux can share data with any Google client, enabling clients to choose when and when not to materialize data. We use Google's Pub/Sub to send electronic notifications to clients so that they can build automation around their data processing. We use BigQuery's data sharing capabilities to share schemas and semantic metadata, making data on BigQuery easier to use. We're even ingesting live FX prices directly into BigQuery for consumers to use. Lastly, BigQuery's support for cross-region replication without any additional costs is incredibly powerful. At the end of the day, we're creating a delightful experience for our mutual clients: cheaper, faster, better access to data and Google Cloud.
What’s next for you on your data journey?
Michael Rude: We continue to see growing demand for external data. Every client we talk to seeks to integrate third-party data with internal data to make better decisions across all of their use cases. These same clients are also daunted by the challenge of integrating with third-party data suppliers, wrangling that data into submission, and extracting insights. For firms doing this in-house, it's a linear problem solved only with more resources, and this is where Crux comes in. We're the easy button for accessing more data faster. We are a managed service today, but we are quickly rolling out self-service tools that will enable our clients to use the managed service for non-sensitive data and self-service for internal data. Our clients will use Crux as an enterprise solution for integrating and operating all of their data on Google Cloud.
Listen to more Google Cloud customer data journeys:
https://g.co/cloud/datajourneys