#+TITLE: Duree: automated universal database
#+SUBTITLE: seeking pre-seed funding
#+AUTHOR: Ben Sima
#+EMAIL: ben@bsima.me
#+OPTIONS: H:1 num:nil toc:nil
#+LATEX_CLASS: article
#+LATEX_CLASS_OPTIONS:
#+LATEX_HEADER:
#+LATEX_HEADER_EXTRA:
#+LATEX_COMPILER: pdflatex
#+DATE: \today
#+startup: beamer
#+LaTeX_CLASS: beamer
#+LaTeX_CLASS_OPTIONS: [presentation,smaller]

* Problem

Developers spend too much time managing database schemas. Every database
migration is a risk to the business because of the high possibility of data
corruption. If the data is modeled incorrectly at the beginning, it takes
months of developer time to gut the system and re-architect it.

* Solution

- Using machine learning and AI, we automatically detect the schema of your
  data.
- Data can be dumped into a NoSQL database without the developer thinking much
  about structure; we then infer the structure automatically.
- We can also generate a library of queries and provide an auto-generated
  client in the user's chosen language.

* Existing solutions

- Libraries like alembic and migra (Python) make data migrations easier, but
  don't help you make queries or properly model data.
- ORMs help with queries but don't give you much insight into the deep
  structure of your data (you still have to do manual joins) and don't help
  you properly model data.
- GraphQL is the closest competitor, but requires manually writing types and
  knowing about the deep structure of your data. We automate both.

* Unsolved problems

- We have not yet decided whether to build this on top of existing NoSQL
  databases or to develop our own data store. We could reuse an existing
  [[https://en.wikipedia.org/wiki/Category:Database_engines][database engine]] to provide an end-to-end database solution.

* Key metrics

- How much time do developers spend dealing with database migrations? What
  does this cost the business? We can reduce that time, and with it the cost.
- How costly are failed data migrations and backups? We reduce this risk.
* Unique value proposition

We can automate the backend data wrangling for 90% of software applications.

* Unfair advantage

- I have domain expertise, having worked on similar schemaless database
  problems before.
- First-mover advantage in this space. Everyone else is focused on making
  database migrations easier; we want to make them obsolete.

* Channels

- Cold calling users of MongoDB and similar databases.

* Customer segments

- *Early adopters:* users of MongoDB and GraphQL who want to spend time
  writing application code, not managing database schemas. The MVP would be
  to generate the GraphQL code from their MongoDB database automatically.
- We will expand support to other databases one by one. The tech could be
  used on any database, or we could expand by offering our own data store.

* Cost structure

** Fixed costs

- Initial development will take about 3 months (~$30k).
- Supporting each additional database will take a month or two of
  development.

** Variable costs

- Initial analysis will be compute-heavy.
- Subsequent analyses can be computationally cheap by building on the
  existing model.
- Customer acquisition could be expensive; we will likely hire a small sales
  team.

* Revenue streams

- $100 per month per database analyzed
  - our hosted service connects to their database directly
  - includes client libraries via GraphQL
  - may increase this if it turns out we save companies a lot more than
    $100/mo, which is likely
- enterprise licenses available for on-prem
  - allows them to have complete control over their database access
  - necessary for HIPAA/PCI compliance
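
* Appendix: schema inference sketch

The MVP idea (generating GraphQL types from documents already stored in a
schemaless database) can be illustrated with a minimal, purely rule-based
sketch. It is not the machine-learning approach proposed in this deck, and it
makes several assumptions: documents are flat Python dicts as a MongoDB driver
might return them, and the names ~SCALARS~, ~infer_fields~ and ~to_graphql~
are hypothetical, chosen only for this example.

#+BEGIN_SRC python
# Hypothetical mapping from Python value types to GraphQL scalar names.
SCALARS = {bool: "Boolean", int: "Int", float: "Float", str: "String"}


def infer_fields(docs):
    """Return {field_name: (graphql_type, required)} for a list of flat dicts."""
    seen = {}
    for doc in docs:
        for key, value in doc.items():
            if value is None:
                continue  # a null value tells us nothing about the type
            # Assumption: nested or unknown values fall back to String here.
            seen.setdefault(key, set()).add(SCALARS.get(type(value), "String"))

    fields = {}
    for key, types in seen.items():
        if len(types) == 1:
            gql = next(iter(types))
        elif types == {"Int", "Float"}:
            gql = "Float"  # widen mixed numeric values
        else:
            gql = "String"  # conflicting types: fall back to String
        # A field is required only if every document has a non-null value.
        required = all(doc.get(key) is not None for doc in docs)
        fields[key] = (gql, required)
    return fields


def to_graphql(type_name, fields):
    """Render the inferred fields as a GraphQL type definition."""
    lines = [f"type {type_name} {{"]
    for name in sorted(fields):
        gql, required = fields[name]
        lines.append(f"  {name}: {gql}{'!' if required else ''}")
    lines.append("}")
    return "\n".join(lines)


docs = [
    {"name": "Ada", "age": 36},
    {"name": "Bob", "age": 41.5, "email": "bob@example.com"},
]
print(to_graphql("User", infer_fields(docs)))
#+END_SRC

For these two sample documents the sketch marks ~name~ and ~age~ as required,
widens ~age~ to ~Float~, and leaves ~email~ optional. A real implementation
would also need to handle nested documents, arrays, and references between
collections, which this example deliberately leaves out.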