Machine Learning Audit
While it has almost become table stakes for a business to claim that it uses AI, data technologies present more risks than traditional software, not only because of scale and growing privacy concerns, but also because well-established processes for developing data products are lacking. Even companies with great data science teams must sometimes plug in human intuition until they can collect enough data to automate. And investors must often judge companies based on their potential to gather and effectively use data assets. We developed the Data Science Audit service line to analyze the feasibility of a company’s machine learning goals.
Our analysis ranges from understanding how the target’s data are generated to assessing the technologies that enable data-driven decision making and modeling. Over a 72-hour period, we liaise with the target’s business and data science leads, conduct data and code reviews, assess processes around machine learning and analytics projects, and interview technical talent at each level of the hierarchy.
1. Org Analysis
A useful data audit starts with a thorough understanding of where data are used in the organization. Are data professionals centralized or distributed? Do you have an established data science team or citizen data scientists in each functional group? After going through an org chart with the CEO, we meet with the data science and analytics leads to understand their philosophy on hiring, managing, and driving results. We collaboratively build a schedule, picking the key individual contributors to interview and setting common-sense boundaries on code review. This process builds trust and reveals deeper learnings than the typical adversarial approach.
2. Tech Review
Technical due diligence (DD) often frustrates target teams because the wrong talent is used. We don’t throw full-stack generalists or junior talent at audits. All Brandt data audits are executed by a data scientist of Chief Data Scientist caliber, hand-picked for the problem domain and modeling techniques they’ll be encountering.
- Data Platform – in this review, we analyze how accessible the data are for driving business value through insights or models. It encompasses database audits, application analytics, and documentation, as well as interviews with key individual contributors to assess how much friction there is between a question and a data-backed answer.
- Machine Learning Pipelines – the dirty secret in data science is that a lot of work is done in ad-hoc scripts, and most results are not reproducible. With this review, we look to assess the resilience of the model-building process. We interview key data leads to understand the process of going from research to production.
- Data Quality – Garbage In, Garbage Out. This is a core tenet of machine learning. Because data storage has become inexpensive, companies now store all of the data they can get their hands on. However, simply storing data does not mean it will be forward-compatible with future use cases. In this review, we analyze how data are collected and processed to assess whether they are structured to remain usable for producing insights and training models as the product evolves.
We contextualize this analysis in terms of the feasibility of the CEO’s roadmap to help investors understand the uncertainty around the target company’s goals with data. For example: is the CEO’s characterization of their technical talent correct? Is the CTO’s roadmap and timeline feasible given the available data, data science talent, and the state of the art in modeling? If you invest, can the existing team adequately attract and evaluate incoming analytics talent? Or would the code be impossible for outsiders to work on because it runs from R scripts on the data scientists’ laptops?
Jourdan has an old man’s wisdom alongside a young man’s energy. Though he is much younger, I often turn to Jourdan for the benefit of his clear thinking and his shockingly deep knowledge on many topics. I’m surrounded by smart people, but Jourdan is a singularity.
Mike Edelhardt, General Partner, Social Starts VC & Joyance Partners | $85M funds under mgmt
Jourdan and his team brought rigor to our diligence process. As a result of their work, we were able to make smart passes on a few initially tantalizing machine learning opportunities. We later contracted with Brandt & Co. to conduct product workshops with 2 portfolio companies, which were fruitful and well received.