As I reflect on the journey of building Oatfin, I can’t help but feel an immense sense of gratitude. Here are some things I’m thankful for as an early-stage startup founder:
1. Supportive Network: Grateful for the mentors, advisors, friends, and family who’ve been with us from the beginning. The guidance, encouragement, and connections have been invaluable.
2. Learning Opportunities: Every challenge is a chance to learn and grow. Gratitude for the experiences, both good and tough, that shape us and make us more resilient entrepreneurs.
3. Early Adopters and Customers: Huge thanks to our early adopters and customers! The feedback so far has been gold, and we’re committed to delivering value that exceeds expectations.
4. Resilience and Perseverance: Building a startup tests us in ways we never imagined. Grateful for the resilience and perseverance that keeps us moving forward, even when the road is tough.
5. Access to Resources: From funding to workspace and technology, having the right resources is a blessing. Thank you to everyone who has contributed to our journey in any way.
6. Market Validation: Excited to see positive responses from the market. It’s a sign that our vision resonates, and there’s a demand for what we’re building. This validation fuels our determination to keep pushing boundaries.
Here’s to the journey ahead, filled with challenges, triumphs, and continued growth!
Thank you for subscribing to Cloud Musings! If you haven’t subscribed yet, please consider subscribing on LinkedIn and Substack to receive automatic updates when I publish new editions.
It’s been a while since our last regular update, but we’re excited to announce that we’re getting back into our monthly cadence. As we continue our seed fundraising journey, we’d greatly appreciate introductions to more investors and enterprise partners who share our enthusiasm for what we’re building at Oatfin.
Over the past few months, we’ve been hard at work planning the next version of our product and crafting a roadmap. Since the launch of Oatfin Cloud, we’ve received invaluable feedback from our users. One topic that continually surfaces as a challenge for our customers is cloud cost management. While AWS offers numerous benefits, managing costs can often be a significant hurdle. For example, at one startup where I worked, we spent north of $30,000 per month without a clear understanding of why.
Introducing Oatfin Cloud Cost Intelligence (Beta)
This week, we launched Oatfin Cloud Cost Intelligence in beta, which uses AI to simplify cloud cost management on AWS.
Here are some of the key challenges we’re addressing:
Unpredictable Costs: One of the most significant issues with AWS is the unpredictability of costs. Usage can vary from month to month, and it’s challenging to estimate how much services will cost.
Hidden Costs: AWS has hidden costs, such as data transfer fees, storage costs, and charges for additional services or features. These can add up quickly and catch organizations by surprise.
Lack of Cost Visibility: Many organizations struggle with a lack of visibility into their AWS spending. Without proper monitoring and reporting tools, it’s challenging to identify cost-saving opportunities.
Complex Pricing Models: AWS offers complex pricing models with various pricing tiers, discounts, and options. Understanding these models can be challenging, making it easy to overspend.
Lack of Cost Accountability: Without proper cost allocation and accountability mechanisms, different teams or departments within an organization may overspend without realizing it.
Solving the cloud cost problem using AI is indeed complex, but we firmly believe that AI can help optimize cloud costs by predicting usage, recommending resource allocation, and identifying cost-saving opportunities.
Our Approach in a Nutshell:
Data Collection: We collect data on AWS cloud usage, such as historical billing data, resource usage metrics, and other pertinent information.
Data Preprocessing: We clean and preprocess the data to make it suitable for AI modeling, removing irrelevant data and normalizing values.
Feature Engineering: We create meaningful features from the data, such as CPU usage, memory usage, or user behavior, which can be used for modeling.
Model Selection: We choose an appropriate AI model for the problem, with options like time series forecasting, regression, and reinforcement learning models.
Model Training: We train the selected AI model on the preprocessed data.
Cost Prediction: Using the trained model, we predict future cloud costs.
Recommendations: Based on the model predictions, we generate recommendations to optimize cloud costs.
Monitoring and Continuous Learning: We continually monitor the cloud environment, update the model with new data, and adapt recommendations as the environment changes.
We welcome any feedback you may have and invite you to participate as beta testers.
Thank you for reading and being a part of our journey at Oatfin!
As promised, in this week’s edition of Cloud Musings, I thought I would do a deep dive with code into dynamic scheduling and explain how we solve this challenge. Don’t forget to subscribe here on Substack or LinkedIn. Thanks for reading!
Last week, I wrote about this on a very high level. Here is a demo of how it works. I’ve also started to open source some of our code base so people can understand how our platform works.
We have a pretty solid architecture:
Python, Flask API
Celery, Redis task queue
React, Typescript Frontend
Docker on AWS ECS and Digital Ocean
To schedule a deployment, a user specifies a cloud infrastructure, the date, time, and dependency. Dependency is optional, but we could imagine the case of deploying the API before a change in the UI or a database change before deploying the API. When a user specifies a dependency, it runs 15 minutes before the actual deployment.
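As a rough illustration of the dependency timing described above, here is a minimal sketch that plans a deployment and its optional dependency 15 minutes earlier. The function name, the constant, and the service names are hypothetical and not taken from our code base.

```python
from datetime import datetime, timedelta

# The dependency runs 15 minutes before the actual deployment.
DEPENDENCY_LEAD = timedelta(minutes=15)


def plan_chain(service, dependency, deploy_at):
    """Return (name, run_at) pairs in run order: dependency first, then service."""
    plan = [(service, deploy_at)]
    if dependency is not None:
        plan.insert(0, (dependency, deploy_at - DEPENDENCY_LEAD))
    return plan


# A deployment just after midnight: the dependency lands on the previous day.
for name, run_at in plan_chain("api", "database-migration", datetime(2023, 2, 10, 0, 5)):
    print(name, run_at.isoformat())
```

Doing the subtraction on a full datetime, rather than on the hour and minute fields, keeps the result correct when the dependency crosses a day, month, or year boundary.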
First we capture the parameters that the UI sends, then call the service to create the actual schedule. If the user specifies a dependency, we also create a scheduled entry in MongoDB and Redis for the dependency.
The DeploymentSchedulerService is used to first translate the user date and time from their timezone to UTC, then it creates an entry in Redis. We’re also using crontab from celery to create the actual schedule. The challenge here is that we can only specify month_of_year, day_of_month, hour, and minute. We can’t specify a year. We handle this by deleting the entry from Redis once the scheduled deployment is successful.
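Here is a minimal sketch of the timezone translation step, using only the Python standard library. It is not the actual DeploymentSchedulerService code: the function name is hypothetical, and the dictionary keys are chosen to mirror the crontab fields mentioned above (month_of_year, day_of_month, hour, and minute).

```python
from datetime import datetime, timedelta, timezone, tzinfo


def to_crontab_fields(local_time: datetime, user_tz: tzinfo) -> dict:
    """Interpret a naive local time in the user's timezone; return UTC crontab fields."""
    utc_time = local_time.replace(tzinfo=user_tz).astimezone(timezone.utc)
    # There is no "year" key: the crontab schedule has no year field, which is
    # why the entry has to be deleted once the deployment has run.
    return {
        "minute": utc_time.minute,
        "hour": utc_time.hour,
        "day_of_month": utc_time.day,
        "month_of_year": utc_time.month,
    }


# US Eastern standard time as a fixed UTC-5 offset, to keep the example
# dependency-free. zoneinfo.ZoneInfo("America/New_York") could be passed instead.
eastern = timezone(timedelta(hours=-5))
fields = to_crontab_fields(datetime(2023, 2, 10, 3, 0), eastern)
print(fields)
```

The conversion is done on the full date, not only on the hour, because a late-evening local time can fall on the next day (or even the next year) in UTC.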
In this edition of Cloud Musings, I thought I would dive into user analytics from Google Analytics from the last couple of years. Some people find it useful, but I think we are still early to have meaningful user data.
As we get more data, we will want to double down on the channels that bring the most bang for the buck. We are not there yet. In fact, we have done no marketing or advertising of any kind. Ideally, we’ll be doing marketing where developers hang out, like GitHub, GitLab, etc.
For the analytics, blue bars represent 2021 and the red bars represent 2022. I’m doing this for both the website as well as the app. I also made a quick video to show this data from Google Analytics here.
As we can see from 2021 to 2022, we had massive user growth. In particular, we went from 482 new users in 2021 to 1905 new users in 2022. The number of sessions and page views grew accordingly.
Diving into the different acquisition channels, a lot of the traffic came from direct hits. My guess is that a lot of people prefer to type the address in the browser as opposed to clicking on a link. I personally do that, especially when Google search shows an ad that I don’t want to click on, so I type the address in the browser instead. Also, I think Google Analytics counts email as direct hits. We did a lot of cold email outreach.
As for social channels, I only use Twitter and LinkedIn. It’s not a lot of hits, but it’s still significant to see that people are coming from Twitter and LinkedIn.
For the app, I’m using Google Analytics v4 and I’ve found it less useful than Google Analytics v3. There aren’t as many details to dive into. Here we see the user growth.
Here you can see the different user acquisition channels. Again, a lot of the users come from direct hits, few organic search, and referrals for both 2021 and 2022. That’s very much what I expected.
Looking forward to diving into dynamic scheduling in next week’s Cloud Musings newsletter. This week, I will talk about User Acquisition and Growth. If you like this kind of content, subscribe to our newsletter: https://lnkd.in/e3Xj4qhG
Dynamic scheduling is one of the biggest challenges developers face. For example, I want to deploy the Oatfin API on February 10, 2023 at 3:00 AM. Dynamic scheduling is hard because there is little support for it out of the box. On top of that, as every developer knows, timezones are very hard to deal with.
Static scheduling, on the other hand, is very simple: every programming language provides some kind of support for writing cron jobs. An example of a cron job: I want to import data from a vendor every day at 8:00 PM.
Here is a very high-level view of how we deal with this problem:
For the frontend:
1. To keep it simple, a user specifies year, month, day, hour, and minute from their own point of view.
2. We use the moment-timezone npm package to guess a user’s timezone from the browser.
Backend:
1. We run 3 Docker containers: celery, scheduler, and api. Celery is the base image, and the other 2 containers extend the base image and override the CMD directive in the Dockerfile.
2. We use MongoDB to store the exact schedule in the database with the user’s timezone.
3. We use Celery as a task queue and celery-redbeat as the scheduler. Celery-redbeat provides a RedBeatSchedulerEntry object to insert schedules in Redis. When we insert a schedule in Redis, we translate the user’s schedule to UTC date and time.
4. Once the task is complete, we mark it complete in MongoDB, which removes it from the list of scheduled deployments.
5. When a user cancels a task, we delete this entry from Redis and it won’t run.
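The lifecycle in the backend steps above can be sketched with plain dictionaries standing in for the two stores. This is only an illustration: the class and method names are hypothetical, `records` plays the role of MongoDB, `entries` plays the role of Redis, and recording a "cancelled" status is my assumption for the sketch rather than something described above.

```python
class ScheduleStore:
    """In-memory stand-in: `records` mimics MongoDB, `entries` mimics Redis."""

    def __init__(self):
        self.records = {}  # schedule id -> {"status": ..., "timezone": ...}
        self.entries = {}  # schedule id -> UTC run time the scheduler acts on

    def schedule(self, schedule_id, utc_run_at, user_timezone):
        # Keep the user's timezone with the record; the scheduler entry is UTC.
        self.records[schedule_id] = {"status": "scheduled", "timezone": user_timezone}
        self.entries[schedule_id] = utc_run_at

    def complete(self, schedule_id):
        # Marking the record complete removes it from the pending list.
        self.records[schedule_id]["status"] = "complete"
        self.entries.pop(schedule_id, None)

    def cancel(self, schedule_id):
        # Deleting the scheduler entry is what prevents the task from running.
        self.records[schedule_id]["status"] = "cancelled"  # assumption
        self.entries.pop(schedule_id, None)

    def pending(self):
        return [sid for sid, rec in self.records.items() if rec["status"] == "scheduled"]


store = ScheduleStore()
store.schedule("deploy-api", "2023-02-10T08:00:00Z", "America/New_York")
store.schedule("deploy-ui", "2023-02-10T09:00:00Z", "America/New_York")
store.cancel("deploy-ui")
print(store.pending(), sorted(store.entries))
```

Keeping the user-facing record and the scheduler entry separate means the UI can always show the schedule in the user’s own timezone, while the scheduler only ever deals with UTC.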
As promised, this newsletter would be very technical sometimes to target the technical audience. Don’t forget to subscribe for more updates. It’s 36 subscribers strong on LinkedIn. Thank you for reading!
It took me the better part of the weekend to get the Celery, Redis task queue and scheduler to work, but it’s working now! Happy to talk about some of the challenges! This assumes familiarity with AWS Elastic Container Service, message brokers, and task queues.
We have a pretty solid architecture:
Python, Flask API
Celery, Redis task queue
React, Typescript frontend
Docker on AWS ECS and Digital Ocean
What is Celery?
Celery is a task queue with a focus on real-time processing that also supports task scheduling. It is a simple, flexible, and reliable distributed system for processing large amounts of messages. It works with Redis, RabbitMQ, and other message brokers.
Some of the challenges I came across:
First, it took me a while to connect to Redis (Elasticache) on AWS. You have to add inbound rules to both security groups in order for the API to communicate with Redis over a specific port like 6379, but it didn’t work for me. I ended up using Redis Cloud because it is a lot simpler than AWS Elasticache. Another solution would be to run Redis on a Digital Ocean droplet or AWS EC2 instance, but I would have to expose the IP and port to the outside world.
The next challenge was how to get the Celery container to run on AWS Elastic Container Service. There are a couple ways to make it work:
Multiple containers within the same task definition
But this was not necessary because the Celery container doesn’t have to scale like the API. ECS also requires a health check path, but there isn’t one for the Celery container, which meant that starting a separate cluster was out of the question.
The solution was to create a multi-container deployment: a base image for the Celery task queue and a main container image for the API that builds on top of the base one. The API image simply overrides the CMD directive in the Dockerfile.
Here is what this looks like:
Base Celery Container – Dockerfile.celery
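# The FROM line was not shown in the original; this base image is an assumption.
FROM python:3.11-slim
RUN mkdir -p /usr/src/app
# Work from the app directory so the relative requirements.txt path resolves.
WORKDIR /usr/src/app
ADD ./requirements.txt /usr/src/app/requirements.txt
RUN pip install --upgrade pip
RUN pip install --no-cache-dir -r requirements.txt
ADD . /usr/src/app
# Start the Celery worker; the API image overrides this CMD.
CMD celery -A apis.deploy.tasks.tasks worker -l info;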
Flask API Container builds on the base image – Dockerfile.staging
I installed docker on a Digital Ocean droplet and ran the Celery containers on it, one setup for staging and another for production. It works as long as Celery can connect to Redis. In fact, I ran the Celery container locally and it worked. That’s how I figured it out. We could spin up an EC2 instance and run docker on it, but it was cheaper to go with a Digital Ocean droplet.
Building and running the containers is trivial from there on. First, log in to a Digital Ocean droplet with Docker installed, then:
Log in to the Docker registry. I’m using GitLab’s Docker registry.
Finally, run the container on the Digital Ocean droplet.
docker run registry.gitlab.com/oatfin/automata:celery
Once everything works correctly, then we get some nice logs showing that celery is running. Here I’m using host.docker.internal to connect to my local Redis. I didn’t want to show the Redis Cloud link because anyone with the Redis link can intercept your messages.
Thanks again for subscribing to Cloud Musings! Last I checked, it was 38 subscribers strong. If you haven’t subscribed, subscribe to get automatic updates when I publish new editions. I will try to make it interesting sometimes!
This week I got the book Start With Why by Simon Sinek from Darrel, one of our investors at Visible Hands. Darrel along with Daniel were some of our first believers. I started reading the book and thought I would take a step back to talk about why I’m working on Oatfin.
The short answer is that I left Akamai in 2016 and wanted to focus on startups. I realized that I was not a corporate person. I come from a family of entrepreneurs. My parents are entrepreneurs and my grandparents were also entrepreneurs.
First, I started working on a fintech/blockchain solution and it was a pain dealing with infrastructure. Back in 2016, blockchain was also blacklisted by every major cloud and by payment firms like Stripe. I took a break to work at a few startups to learn.
As for the long answer: I’ve been a software engineer for 15 years. In my experience working at GE, Putnam Investments, Akamai, and many early-stage startups, cloud infrastructure was a major challenge. The process is not only manual, but tedious and frustrating at times. If you’ve ever used the AWS or Google Cloud user interface, then you know this pain well!
For example, I was working at a fintech startup and one of my roles was to automate our cloud infrastructure. Sometimes it took days to deploy a simple app.
I was working at another healthcare startup, and it was a lot of the same frustrations. We moved from servers to server-less and then back to servers. Other challenges we faced were cloud spend, testing, security, compliance, and observability into the server-less apps. I left after 6 months to work on Oatfin because I believe the process should be simpler.
Some problems with the cloud currently:
Painfully slow development cycle.
Manual, tedious, time consuming and frustrating.
Days to weeks to build a secure cloud infrastructure.
Vendor lock-in means high cost.
Requires expert knowledge, new staff and skills.
There are some solutions like infrastructure as code (IaC), but I don’t believe developers want to write more code to maintain the code they already have. I’ve written a fair amount of infrastructure as code. Some problems include:
Hard to maintain manual scripts – multiple environments and clouds.
Learning new frameworks and languages like Terraform and Pulumi.
Doesn’t remove the complexity of infrastructure.
Security issues with cloud credentials and secrets in code.
With Oatfin, our mission is to improve developer productivity and the development experience. We make it simple and easy to manage any application on any cloud.
Where the name comes from: the “oat” piece is because I love oatmeal. The “fin” is for fintech. Since I already had the domain name, Twitter and LinkedIn handles, it all stuck around. I also wanted to choose something that I could create the Google search presence for as opposed to something that would be confusing.
There are 3 big tenets in our application: infrastructure automation, security, and compliance. Our focus is cloud native, Docker, and Kubernetes.
Why cloud native?
Containerization provides many benefits, like portability across different clouds and operating systems, as well as being easier to scale. There is no doubt that more enterprise companies will take advantage of cloud native deployments as they continue to use the public cloud.
Currently, the app allows customers to define their containers from any public or private container registry. We automate the infrastructure so they can choose the type of infrastructure they want to create. Since we have the containers, we can also scan them to detect vulnerabilities and compliance.
There are many features that make us stand out:
Being able to clone an infrastructure to debug production issues.
Scheduling a deployment and specifying dependencies that need to be deployed first.
Zero trust security.
Scanning containers for vulnerabilities.
Our target customers are enterprise companies. As a startup, deploying native applications is very simple. You are most likely deploying a single API with a single container. But as an enterprise company, things get complicated very quickly with very little visibility. For example, you might have an API running on AWS, a database running on premise, and some other pieces running on Google Cloud. Managing these hybrid and multi-cloud environments is very challenging.
The Oatfin architecture is a good example. We have a Python and Flask API that talks to MongoDB with a Celery task queue. The Celery task queue uses Redis as a backend and message broker. The API is deployed on AWS using Elastic Container Service (ECS), the database is deployed on MongoDB Cloud, which is on Google Cloud, and we have Redis running on Redis Cloud. Finally our frontend is running on DigitalOcean along with the Celery task queue.
With that said, we’re raising our seed round and I would love to connect with investors who are excited about the cloud and developer tools space.
I’m hoping to write weekly on Sunday and talk about the features we ship weekly, fundraising, customer wins, programs, etc. I find that forcing myself to write about the week forces me to get stuff done to write about!
It will be very technical sometimes to target the more technical audience.
It’s been 2 years since I incorporated Oatfin, but I feel we’re now really making progress. Over the last couple of years, we went through a few programs that prepared us. In 2021, we went through the Google for Startups Founders Academy. That was a 6-month program that culminated in Oatfin getting funded through the Google for Startups Black Founders Fund.
In 2022, we went through Lightship Capital’s Bootcamp, then we did another program with Google called Black+ TechAmplify. In late 2022, we also started a year-long development program with Accenture where we have the chance to do paid pilots with their clients. In the last quarter of 2022, we got accepted to Visible Hands’ accelerator program and got our pre-seed funding.
We kicked off our seed fundraising in October 2022. We have a number of investors who are interested, but we have to get a lead investor who is excited about the space. We are actively looking for a lead investor and would love introductions to investors who are excited about the cloud and developer tools space.
Last week we shipped more features:
Chain deployments. For example, service A depends on service B. Scheduling service A for deployment automatically runs service B 15 minutes before. Some use cases: making a database change before deploying an API or deploying the API before deploying the UI.
Enable team management and invite team members to collaborate on Oatfin. Our business model is very simple: we charge $249/user per month for the SaaS model billed yearly and $2,999/user per year for the on-premise solution.
This week, I’m finishing up the deployment scheduling and tackling compliance automation, but the heavy lifting is targeted for February. Since we have the cloud infrastructure, compliance automation is the next logical step. Compliance is a major problem for a lot of enterprise companies as well as startups.
I’m also looking forward to 2 great programs this week, both starting January 25, that will hopefully help us get to the next level:
AWS CTO Fellowship is a 5-week program for seed-stage startups. It is a growing community of over 3,000 early-stage and venture-backed CTOs. It is designed to provide early-stage CTOs with technical resources, guidance, and community. The program consists of short weekly sessions with CTOs from top late-stage startups covering a different theme each week.
Bolster Ready to Raise is the first Bolster for Startups program in 2023. They are partnering with Jenny Fielding of The Fund to help founders work through a tight fundraising process.
With cloud native, one of the major challenges is observability and being able to debug production issues quickly. But you can’t really debug your app while it’s being used by users in real time.
It’s nice to have a temporary environment that is as close to production as possible, but spinning up and tearing down a cloud infrastructure quickly is a major pain for developers.
One of the cool features we just shipped at Oatfin is the ability to clone an infrastructure or environment one-to-one to make it easy to reproduce and troubleshoot production problems.
In solving this problem, it would have been nice to have a COPY or CLONE operation in REST, something I think is fundamental to almost every API.
For now I’m doing:
If the “source” parameter is present as a query parameter, then it’s a clone operation; otherwise it’s a create operation. The operation is not idempotent, meaning it will create a new copy each time the API is called.
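A minimal, framework-free sketch of that rule, assuming an in-memory dictionary in place of the real database. The function name and the payload fields are hypothetical; only the “source” query parameter comes from the description above.

```python
import copy
import uuid


def create_or_clone(store, query, body):
    """Handle a POST: clone when `source` is in the query string, else create."""
    source_id = query.get("source")
    if source_id is not None:
        if source_id not in store:
            raise KeyError(f"unknown source infrastructure: {source_id}")
        # Deep copy so later edits to the clone never touch the original.
        new_item = copy.deepcopy(store[source_id])
    else:
        new_item = dict(body)
    # A fresh id on every call is what makes the operation non-idempotent.
    new_id = str(uuid.uuid4())
    store[new_id] = new_item
    return new_id


infrastructures = {}
original = create_or_clone(infrastructures, {}, {"name": "production", "containers": ["api"]})
first_clone = create_or_clone(infrastructures, {"source": original}, {})
second_clone = create_or_clone(infrastructures, {"source": original}, {})
print(len(infrastructures))
```

Calling the clone twice with the same source yields two separate copies, which is what you want for throwaway debugging environments.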
Lots of great things happened in 2022, but it was no doubt one of the most brutal years given the recession, mass layoffs, and investor pullback. I learned a lot about fundraising, business, and technology.
Where I lacked a lot was building relationships with investors. Further complicating things was the Covid pandemic. Building a company when there is a raging global pandemic was challenging.
Some of my focus areas for 2023:
Expand my knowledge about Venture Capital/Financing
Grants like the NSF SBIR/STTR program
Expand my knowledge about business
Startup business valuation
Pitch deck/Business plan
Selling and marketing to enterprise businesses
Expand my knowledge about Product Management and Technology