Saturday, October 12, 2024

From Concept to PRD: My Journey Collaborating with AI on a RAG-based Extraction System

I recently embarked on an exciting project to overhaul our document attribute extraction system. What started as a simple idea quickly evolved into a comprehensive plan for a cutting-edge RAG-based system. Throughout this journey, I collaborated with an AI assistant, and I want to share how this partnership helped shape our project roadmap.

The Initial Concept


It all began when I approached our AI assistant with a straightforward question: "Code to fine tune a llama 3.2 model for nested structured data extraction from pdf files. What should be the training dataset format?"

Little did I know that this simple query would spark a series of discussions that would completely transform our approach to document extraction.

Embracing RAG: A Game-Changer

As we delved deeper into the possibilities, the AI suggested implementing a Retrieval-Augmented Generation (RAG) approach. This concept immediately piqued my interest. We explored how RAG could enhance our system's ability to handle complex, lengthy documents while maintaining high accuracy.

The AI provided a detailed explanation of how we could structure our system:
  1. Chunk the input document
  2. Create embeddings using an E5 model
  3. Generate synthetic answers with a fine-tuned LLaMA model
  4. Retrieve and re-rank relevant chunks using ColBERT v2
  5. Extract attributes from the top-ranked chunks
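
The steps above can be sketched in a few lines of Python. This is only a toy illustration under loud assumptions: the bag-of-words `embed` function stands in for an E5 model, `synthetic_answer` is a placeholder for the fine-tuned LLaMA step, and plain cosine ranking stands in for ColBERT v2 re-ranking. All function names here are hypothetical and not taken from the actual project.

```python
import math
import re
from collections import Counter


def chunk(text, size=40):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy stand-in for an E5 embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def synthetic_answer(question):
    """Placeholder for the LLM step; here it simply returns the question."""
    return question


def retrieve(document, question, top_k=2, size=40):
    """Rank the document's chunks against the (synthetic) answer and keep the best."""
    chunks = chunk(document, size)
    query_vec = embed(synthetic_answer(question))
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), query_vec), reverse=True)
    return ranked[:top_k]
```

Attribute extraction would then run only on the chunks returned by `retrieve`, which is what keeps long documents manageable.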

This approach seemed promising, but I had concerns about performance, especially for transient documents that require quick processing.

Optimizing for Speed and Accuracy

Addressing my concerns, we brainstormed ways to optimize the system for transient documents. The AI suggested implementing a "fast track" pipeline that uses lighter models and skips some computationally expensive steps. This solution struck a balance between speed and accuracy, potentially processing transient documents 50-70% faster than the full pipeline.

Expanding Capabilities: Dependent Data Extraction

As we refined our plan, I realized we needed to handle more complex scenarios. I asked, "Can this be used for doing dependent data extraction? Like find a set of ids and extract specific set of attributes for each of those ids?"

The AI's response was enthusiastic and detailed. We worked together to design a two-stage extraction process:
  1. ID Extraction: Identify and extract a set of IDs from the document
  2. Attribute Extraction per ID: Perform targeted attribute extraction for each ID

This feature significantly expanded the versatility of our system, allowing it to handle nested data structures common in many business documents.
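
A minimal sketch of the two-stage idea, assuming a made-up `ORD-<digits>` ID format and `key=value` attributes. In the real system each stage would be a retrieval plus model call; regular expressions are used here purely as a stand-in, and every name is hypothetical.

```python
import re


def extract_ids(text):
    """Stage 1: collect every distinct ID, in order of first appearance."""
    seen = []
    for match in re.findall(r"ORD-\d+", text):
        if match not in seen:
            seen.append(match)
    return seen


def extract_attributes(text, item_id):
    """Stage 2: gather key=value attributes from lines that mention one ID."""
    id_pattern = re.compile(r"\b" + re.escape(item_id) + r"\b")
    attrs = {}
    for line in text.splitlines():
        if id_pattern.search(line):
            for key, value in re.findall(r"(\w+)=(\S+)", line):
                attrs[key] = value
    return attrs


def extract_nested(text):
    """Run stage 2 once per ID found in stage 1."""
    return {item_id: extract_attributes(text, item_id) for item_id in extract_ids(text)}
```

The output is a dictionary keyed by ID, which maps naturally onto the nested structures mentioned above.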

Bringing It All Together: The PRD

As our ideas coalesced, I asked the AI to generate a Product Requirements Document (PRD). The resulting document was comprehensive, covering everything from key features and technical requirements to performance metrics and potential risks.

What impressed me most was how the PRD evolved through our conversation. When I requested updates to include new features or address specific concerns, the AI quickly incorporated these changes, resulting in a well-rounded, thoughtful project plan.

Lessons Learned

Reflecting on this experience, I've gained valuable insights into collaborating with AI:

  1. Iterative Refinement: Our initial idea evolved significantly through back-and-forth discussion. Don't be afraid to explore tangents or challenge the AI's suggestions.
  2. Leverage AI's Knowledge: The AI brought up concepts and technologies I hadn't considered, like using E5 for embeddings and ColBERT v2 for re-ranking. This broadened our solution space.
  3. Human Expertise is Crucial: While the AI provided extensive technical knowledge, my understanding of our specific needs and constraints was vital in shaping a practical solution.
  4. AI as a Brainstorming Partner: The AI excelled at generating ideas and fleshing out details, making it an excellent brainstorming partner.
  5. Clarity in Communication: Being clear and specific in my queries led to more targeted and useful responses.

Looking Ahead

This collaboration has set us on an exciting path. We now have a solid plan for a RAG-based document attribute extraction system that promises to be more accurate, flexible, and efficient than our current solution.

As we move into the implementation phase, I'm confident that the groundwork we've laid through this AI-assisted planning process will prove invaluable. It's a testament to how AI can augment human creativity and expertise, leading to more innovative and comprehensive solutions.

The journey from a simple question about dataset formats to a full-fledged PRD for a cutting-edge system has been enlightening. It's clear that AI assistants like the one I worked with are not just tools for answering questions, but partners in the creative and strategic thinking process.

I'm excited to see how this project unfolds and look forward to sharing more insights as we bring our RAG-based extraction system to life!


---


PS: This post and the entire conversation were produced with the Claude 3.5 Sonnet model.

Wednesday, June 2, 2021

Review of Intraday Trade Plan of 1st June 2021

Yesterday evening, I wasn't sure if I wanted to trade today. I still ended up trading because I had some time. Long story short, I came out positive today. I got burnt multiple times on both sides during reversals. Finally, I took a very low-risk trade which pushed me into the green towards the end.


Ended the day with a gain of about 0.58% on the capital deployed. I guess for someone who's been trading without leverage for a while, the new changes didn't affect me much, based on the Basket Order Analysis I did in the morning.

Monday, May 31, 2021

Review of Intraday Trade Plan of 31st May 2021

This is a review of the trade plan that I posted here

A few points to highlight based on the trade plan

I'm still afraid of profit booking and a downward fall before we continue to edge higher.

This played out right at the open. We opened flat and went down, before we got bought up rapidly with very high bullishness until the end. We were also dealing with a narrow CPR for the day. I started late today, after the 3 red candles, so I started the day with 15350 PE instead of the 15300 PE originally planned. Also, because we had a huge downside move, I started with 15600 CE on the top instead of the 15650 CE as planned. At around 10:30 or so, when we crossed 15500, the 15600 CEs started to expand like crazy. So I switched to 15700 and eventually to 15750 as we kept taking out each resistance throughout the day. As I was climbing up, I also moved the PEs up, booking profits along the way, all the way to 15450 towards the end. The last strangle I was operating was 15450 PE and 15750 CE.

The call and put premiums seemed to indicate a downward bias throughout the day. We saw 2 massive red candles at the close, which might be the beginning of profit booking as well. We also have large margin requirements starting tomorrow, so I'm not sure how that might affect prices overall.

We gained a little over 13 points for the day today, with an ROI of about 1% on the capital deployed.

All the paper trades that I took are described below:


(Click to enlarge the picture)

If you're wondering why I'm taking paper trades instead of actual trades: I'm currently learning about the market and a few strategies. The idea is to learn to use the position sizes and the PnL more comfortably before diving in with real money. This also introduces me to the whole ecosystem so I can find my way around once the learning phase is over.

Large PUT side OI can be found at 15400, while a large amount of CALL side OI can be found at 16000. Tomorrow being a Tuesday, my workday schedule is fully packed, and we have new margin requirements as well. I'll probably sit tomorrow out and see how things go.

Trade Plan for 31st May 2021 - Intraday

We're at our ATH right now on Nifty. Also, given the huge gap up on Friday after the monthly expiry of May, I'm still afraid of profit booking and a downward fall before we continue to edge higher. To play it safe, I feel that for a short strangle (or an Iron Condor) a safe range for Monday (31st May) would be 15300 - 15650.

Base Trade - 15300 PE - 15650 CE

I'm hoping this 350 point strangle will have a little more than 40 - 45 points in it, and I'm looking to capture around 8 - 10 points of decay.
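
To make the numbers concrete, here is a small sketch of the standard expiry payoff for a short strangle, in points. It is only an illustration: the plan above is an intraday decay capture rather than a hold to expiry, and the 42 point premium used in the usage note is a hypothetical value inside the hoped-for 40 - 45 range.

```python
def short_strangle_pnl(spot, put_strike, call_strike, premium):
    """PnL in points at expiry for a short strangle that collected `premium` points."""
    put_loss = max(0.0, put_strike - spot)    # the short put loses below its strike
    call_loss = max(0.0, spot - call_strike)  # the short call loses above its strike
    return premium - put_loss - call_loss


def breakevens(put_strike, call_strike, premium):
    """Lower and upper spot levels at which the expiry PnL is exactly zero."""
    return put_strike - premium, call_strike + premium
```

With the base trade strikes and the assumed premium, `breakevens(15300, 15650, 42)` gives `(15258, 15692)`, so the position keeps the full premium anywhere between the two strikes and starts losing outside those breakevens.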

Reasons behind 15300 support

We have very good support (OI) built up at 15400 and 15300. Even if we breach 15400, I feel 15300 would still be defended this week. We might change our view as the week progresses.

Reasons behind 15650 resistance

Even after a huge gap up and rapid buying by both FII and DII, we made a jump of close to 140 points. I don't expect the 200 point difference to be taken out intraday.

Possible adjustments

We'll move our CE down if we start heading towards 15400. Similarly, we will move our PE up once we cross 15500. Also, if we gap up again on Monday, cross over 15500 and don't get sold into in the first 15 minutes, then we'll probably start directly with 15350 instead of 15300 in the base trade.

SL

The plan is not to lose more than 8 points on a Monday, since the decay isn't so great, especially when we don't have any overnight positions.

Chart

Nifty Futures Chart


Wednesday, April 7, 2021

Trading Notes - 7th April 2021



An inverted cup and handle pattern is forming. It might take a few days, and things might take a dive to the 85 - 75 range.



BEML seems to be bouncing off a support zone around 1240. The next target of 1540 is somewhere nearby. Also, for the last month or so, it seems to have been forming some kind of bullish flag pattern.

BSE just broke out of a descending channel pattern. We're looking at a target of 650+ with a SL at around 580. 


BAJAJCON seems to be forming an ascending triangle pattern. We're waiting for a 285 breakout. Once that happens, the target is around 350+, with a SL around 250.



On BSOFT, we're waiting for a 285 breakout, and most likely the target would be around 340+.



CARE RATING LTD. is in a bearish channel trend. It just met resistance at the top of the channel at today's high. Unless it breaks out of that channel, it is going to continue downwards.


CCL has been in a bearish channel for more than 4 years now. Since last July, it seems to be taking support in the 220-225 range. Once it crosses 255-265, I guess we'll have a true breakout.

Wednesday, April 20, 2016

RootConf 2016

Last week I attended Rootconf 2016. It was my first Rootconf and my second HasGeek event. The event was well organised compared to Fifth Elephant in 2014. You can find my personal notes from various talks at https://github.com/ashwanthkumar/rootconf-2016.

At 4PM on Day 2, I got an opportunity to go to Apigee's Bangalore office to present Matsya. I had prepared a presentation hoping to give it as one of the Flash Talks; I never got a chance at the event, but it did come in useful during Apigee's session :)


Tuesday, March 29, 2016

Marathon-Alerts: Alerting tool for Marathon Apps

Another month has passed by, and a bunch of open source contributions have happened. I recently published a post on Indix's blog titled "Marathon-Alerts: A Tool for Keeping Your Apps on Marathon Under Radar". It is the official post open sourcing the marathon-alerts tool to the outside world. Without further delay, here's the tl;dr version of the post.

Marathon-Alerts is a tool to monitor apps running on Marathon. Even though Marathon has reached its 1.0.0 milestone (as of writing), it still doesn't have first class support for alerts on the tasks running via it. If you run only containers, then there are a lot of tools out there in the market which can help you with monitoring and alerting (Sysdig, for example).

Unlike k8s, Marathon isn't limited to running only containers; it runs arbitrary commands too. This is where specialised tools like marathon-alerts come into play. Today we have a notifier only for Slack, but it's quite easy to extend to other sinks that integrate with your existing infrastructure.


The screenshot above is an example of the alert that arrives on our Slack channels when one of our apps goes down because of an error.

This is one of my best open source contributions in recent times. Check out the project on https://github.com/ashwanthkumar/marathon-alerts, it's released under Apache 2 License. Fork away!


Friday, February 12, 2016

How not to use awk

Today I was working on automating AMI creation for a project and ended up using packer. Packer has this thing called -machine-readable output, which can be easily parsed as CSV (at the very least). I ended up writing a bash script to parse the output.
I started off using a while loop + awk for parsing (in that order), emitting the lines that contain the artifact information. As you can see above, parsing around 1700+ lines took more than 6 seconds.



Afterwards I refactored the code to use awk first and feed that output to while loop, which actually ran 100x faster.


I didn't know awk was so good at processing things at scale (if I may).
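
The original scripts are not reproduced here, but the single-pass idea (filter the whole stream once, instead of starting a new process for every line) can be sketched in Python. This assumes the comma-separated `timestamp,target,type,data...` layout of packer's machine-readable output; the function name and the sample line in the usage note are made up for illustration.

```python
import csv
import io


def artifact_rows(machine_readable_output):
    """Yield the data fields of every 'artifact' line, reading the input once."""
    reader = csv.reader(io.StringIO(machine_readable_output))
    for row in reader:
        # Assumed layout: timestamp,target,type,data...
        if len(row) >= 3 and row[2] == "artifact":
            yield row[3:]
```

For a hypothetical line such as `1455260100,amazon-ebs,artifact,0,id,us-east-1:ami-abc123`, the generator yields `["0", "id", "us-east-1:ami-abc123"]`. The per-line cost is a list comparison rather than a process spawn, which is the same reason the awk-first version was so much faster.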

Saturday, February 6, 2016

Introducing Matsya

I recently wrote a blog post on Indix's Engineering blog about Matsya. Do check it out at http://www.indix.com/blog/engineering/matsya-auto-scaling-availability-zone/. I'm going to summarise the blog post here in a TL;DR version.

We have 100% of our Hadoop workloads running on AWS Spot infrastructure. One of the prominent issues we've faced is "Spot Outages". In order to solve that, we went across AZs. This in turn resulted in huge data transfer costs, especially for systems like Hadoop HDFS. We realised the only way to survive this madness is to be on a single AZ but intelligently move to another AZ when there's a Spot price spike in the current one. That worked :) There were cases when we couldn't find any AZ within our bid budget. Typically our bid budget was 100% of the OD (On-Demand) price. During such times, Matsya can fall back to OD instances until the surge ceases.
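
The decision described above could be sketched roughly as follows. This is not Matsya's actual code, just a minimal Python illustration; the function name, the `bid_fraction` parameter and the price dictionary are all hypothetical.

```python
def choose_market(current_az, spot_prices, on_demand_price, bid_fraction=1.0):
    """Return (az, market), where market is 'spot' or 'on-demand'."""
    budget = on_demand_price * bid_fraction
    # Stay put while the current AZ is within budget, to avoid cross-AZ transfer costs.
    if spot_prices.get(current_az, float("inf")) <= budget:
        return current_az, "spot"
    # Otherwise move to the cheapest AZ that is still within the bid budget.
    affordable = {az: price for az, price in spot_prices.items() if price <= budget}
    if affordable:
        return min(affordable, key=affordable.get), "spot"
    # No AZ is affordable: fall back to On-Demand until the surge ceases.
    return current_az, "on-demand"
```

Keeping the current AZ as the first preference reflects the point about data transfer costs: moving is only worth it when the current Spot market has actually spiked.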

I wrote a tool to automate the above process, and that's Matsya. Matsya is the name of the first incarnation of Lord Vishnu in this world. The main part of the story is that he takes the form of a Fish (with a horn) and carries the Seven Sages (saptarishis) to safety on the Judgement day. The name was apt since Matsya is able to carry the cluster from one Spot Market to another (read cheapest) during a Spot surge.

Bonus - The slides from the presentation I gave at the Chennai Devops meetup in January 2016 can be found at http://j.mp/to-matsya

Thursday, February 4, 2016

Building GoLang on SnapCI

I've been a long time user of SnapCI. I recently started playing around with GoLang. I've started writing tests and cutting releases for some of my hobby projects, so it's high time I set up CI for them. Unfortunately, SnapCI doesn't support GoLang out of the box, and making it work is a bit of a hassle. After fighting with it for more than an hour, I figured out how to compile my go project.

Disclaimer - My project on my local machine is part of my custom $GOPATH so all the `go` tool chain works. Makefile also assumes this. If you're used to building your project on a non-standard directory structure, this is not for you.

Thanks to this post, which talks about how to set up GoLang on SnapCI. It doesn't work out of the box anymore, but it does have the fundamentals to get started. Final settings that worked for me

Hope this helps.



Apart from the above, we also need to set up the GOPATH in order for go to work. On the stage configuration page there is a section called "Environment Variables for this stage"; in there we need to set the name to GOPATH and the value to /var/snap-ci/.

Tuesday, January 26, 2016

Getting started on Go Lang with Slack

It is Republic Day today here in India, January 26th, and a national holiday. During one of our random morning conversations with Manoj, Hashicorp cropped up, and that quickly escalated to "Go Lang".

Go Lang has been on my radar for quite some time. I wanted to get started with the language. Personally, it bugs me that lots of new tools and products are being built on Go while I'm still stuck with the JVM. After all, moving to Devops demands some change in how I operate, right? 😃

While getting started with a new programming language, I always write a simple tool using it and open source it. I wrote a Live NSE Stock information fetcher while I was learning NodeJS 4 years ago and a uClassify Scala client while I was getting started with Scala 3 years ago.

Along similar lines, introducing a Slack webhooks library in Go. Similar to its Java counterpart, it helps you post messages to Slack using an Incoming Webhook URL.

Github - https://github.com/ashwanthkumar/slack-go-webhook
Usage - https://github.com/ashwanthkumar/slack-go-webhook/blob/master/README.md
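
For readers unfamiliar with Incoming Webhooks, the underlying HTTP call is just a JSON POST. The sketch below shows it in Python using only the standard library. It does not mirror the API of slack-go-webhook (see the README linked above for that); the function names are hypothetical and the webhook URL is whatever Slack issues for your workspace.

```python
import json
import urllib.request


def build_payload(text, channel=None, username=None):
    """Build the JSON body for a Slack Incoming Webhook message."""
    payload = {"text": text}
    if channel:
        payload["channel"] = channel
    if username:
        payload["username"] = username
    return payload


def post_message(webhook_url, payload):
    """POST the payload to the webhook URL and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Separating payload construction from the network call keeps the message format easy to check without touching Slack.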

This is still my first Go lang code so any kind of feedback will be helpful.

Sunday, October 18, 2015

Chrome Tamil TTS Engine powered by SSN Speech Lab

This was the outcome of an attempt at a good first impression that went wrong.

I got to know about SSN's Speech Lab yesterday and built a Chrome extension on top of it as a TTS Engine. These guys have built a wonderful system. You should check 'em out.


Right click on any Tamil text and listen to it in a Male / Female voice.

You can install the plugin from https://chrome.google.com/webstore/detail/lhalpilfkeekaipkffoocpdfponpojob

The code is available on https://github.com/ashwanthkumar/chrome-tts-tamizh

Details for other extensions using the TTS
- Language - ta-IN
- Gender - male and female
- Voice names - Male is Krishna and Female is Radhae

Monday, October 5, 2015

Introducing scalding-dataflow

For the last 3 days, I've been working on trying to understand the Google Cloud Dataflow pipeline semantics for batch processing. The result is a ScaldingPipelineRunner for Dataflow pipelines.

NOTICE (You've been warned)
  1. It is still in very very early stages. 
  2. It doesn't have all translators implemented (as of writing)
  3. It isn't tested on a Hadoop Setup yet

It runs WordCount though :) Do give it a spin.

It goes well with scalaflow - Scala DSL for building dataflow pipelines. 

Special thanks to Cloudera's spark-dataflow project. Couldn't have done without it :)

Friday, September 25, 2015

Winning Amaz-ing Hackathon - Meghadūta

Last weekend was fun at the Chennai Amazon office. In association with Venture City, the Kindle Team at Chennai organized a hackathon on the theme "Building Scalable Distributed Systems". The theme was good enough to attract me to register for the event :) I went to the event with Salaikumar and Vijay Kumar under the team name "Salaikumar".

You can find the problem statements given at the hackathon here.

We won two awards at the event - "Best Voted Award" and "Ultimate Hack Award".

You can find our code at https://github.com/ashwanthkumar/meghaduta.

A few pictures taken during the event

Me doing the presentation of our hack

Salai helping me with the screens

Our prize, a Kindle Paperwhite each and some certificates :)

Saturday, September 12, 2015

Find that missing Host in Hadoop Cluster

I've started to manage 200 node hadoop clusters recently at work. All of these run on AWS with the latest CDH5. We went from a separate HDFS + TT model to co-locating the TT (TaskTracker) and DN (DataNode) daemons.

These machines are all Spot machines backed by an ASG (Auto Scaling Group). If any of them die because of spot prices, they come back up in a while. To manage these machines better, we attach our own custom generated DNS names to them.

Once in a while, the machines that come up don't have either the TT or the DN daemon running. The daemons would have failed at startup for a variety of reasons. The task was to find those missing hosts (generally 1 or 2) out of the lot. So I wrote a script that helps us find the hosts which aren't running one of the processes.

Gist - https://gist.github.com/ashwanthkumar/3624a4e69ab26236a746
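
The gist has the actual script. The core idea is a set difference, which can be sketched in Python as below. How the three host lists are obtained (for example from the custom DNS names, the JobTracker and the NameNode) is left out, and the function name is hypothetical.

```python
def missing_hosts(expected, tasktrackers, datanodes):
    """Map each expected host to the daemons ('TT', 'DN') it is not running."""
    report = {}
    for host in sorted(expected):
        missing = []
        if host not in tasktrackers:
            missing.append("TT")
        if host not in datanodes:
            missing.append("DN")
        if missing:
            report[host] = missing
    return report
```

Hosts that run both daemons are omitted from the result, so on a healthy cluster the report is simply empty.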


Monday, August 31, 2015

Kitchen Bento Setup

After struggling for a couple of days trying to create a new virtualbox with some packages pre-installed for my kitchen tests, I ended up frustrated, only to later find out about mitchellh/vagrant#5492. So, I ended up creating kitchen-bento-setup, which helps me create new virtual boxes from the base opscode box.

For reasons unknown to me, I don't get "SSH Authentication Failure" when packaging the VBox using `vagrant package` the first time from the base opscode box, but any consecutive `vagrant package` from that pre-built box causes the failure.

Tuesday, June 2, 2015

ClassNotFound inside a Task on Spark >= 1.3.0

Context - Spark 1.3.0, Custom InputFormat and InputSplit.

Problem - At my work, I had a custom InputSplit definition which held an object of another class, A, which I then needed to pass to my Key. I then have a Spark job that reads using my custom InputFormat, and things were all fine on Spark 1.2.0. When we upgraded to 1.3.0, things started breaking with the following stack trace.

Caused by: java.lang.ClassNotFoundException: x.y.z.A
 at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
 at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
 at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
 at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:270)
 at java.io.ObjectInputStream.resolveClass(ObjectInputStream.java:625)
 at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
 at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
 at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
 at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
 at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
 at org.apache.spark.util.Utils$.deserialize(Utils.scala:80)
 at org.apache.spark.util.Utils.deserialize(Utils.scala)
 at x.y.z.CustomRecordSplit.readFields(CustomRecordSplit.java:91)

Solution - It took me a while to realize that I'd been using Spark's Utils object to serialize and deserialize the object (x.y.z.A). The fix was very simple:

objA = Utils.deserialize(buffer, Utils.getContextOrSparkClassLoader());

Looks like in earlier versions the __app__.jar was added as part of the Executor and Task classloader, but not in the latest versions. When I passed the Context ClassLoader to the deserialization, it worked perfectly fine.

Lessons
- Don't use Spark's Utils methods. Even though Utils is a private[spark] object, the Scala package access protection doesn't seem to apply since I'm accessing it from a Java class. I never knew that until now.
- Always use Utils.getContextOrSparkClassLoader() when doing Java Deserialization in Spark.

Wednesday, May 20, 2015

Parser Combinator

After fiddling with RegexParser for a while in Scala, I realized how much I missed learning Automata properly at college. I was migrating a 140-line REGEX from MySQL to Scala at my work and learned a lot of new things in the process. It was during one of those times that one of my mentors, yellowflash, helped me understand some of the forgotten concepts about left factoring, recursive grammars, etc.

In the process we were discussing how I would go about writing RegexParser if I had to write it by hand myself. The exercise was to help me understand how the Parser Combinator works, which would help me write better grammars. We did some scribbling on paper and decided to implement it in Scala. You can find it on https://github.com/ashwanthkumar/parser-combinator. It is just a start, still a long way to go. Looking forward to it, it should be fun.
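
To give a flavour of the idea, here is a tiny parser combinator sketch in Python. It is not the code from the repository above (that one is in Scala); the combinator names `char`, `seq`, `alt` and `many` are chosen for illustration. Each parser is a function taking `(text, pos)` and returning `(value, new_pos)` on success or `None` on failure.

```python
def char(c):
    """Parser that matches exactly one expected character."""
    def parse(text, pos):
        if pos < len(text) and text[pos] == c:
            return c, pos + 1
        return None
    return parse


def seq(*parsers):
    """Run parsers one after another; fail if any of them fails."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def alt(*parsers):
    """Try each parser at the same position; the first success wins."""
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse


def many(parser):
    """Apply a parser zero or more times, collecting the results."""
    def parse(text, pos):
        values = []
        while True:
            result = parser(text, pos)
            if result is None or result[1] == pos:  # stop on failure or no progress
                return values, pos
            value, pos = result
            values.append(value)
    return parse
```

For example, `seq(many(alt(char("a"), char("b"))), char("c"))` behaves like the regular expression `(a|b)*c`: larger grammars are built by composing small parsers rather than by writing one big pattern.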

Friday, May 15, 2015

[IDEA] Autoscaling in Hadoop

Everybody today uses Hadoop plus some more of its ecosystem tools, be it Hive, HBase, etc. I have been using Hadoop for writing production code since 2012, and for experiments much earlier than that. One thing that I don't see anywhere is the ability to autoscale a Hadoop cluster elastically. Unlike when scaling web servers, having all map and reduce task slots full doesn't necessarily translate to CPU / IO metrics spiking on the machines.

Figure - Usage graph observed from a production cluster (hand drawn on a white-board)

On this front, only the Qubole guys seem to have done some decent work. You should check out their platform if you haven't. It is really super cool. A lot of the inspiration for this post has come from using it.

This is a hobby project attempt at building just the autoscaling feature for Hadoop1 clusters, as I might have done had I been part of, say, the Qubole team back in 2012.

In this blog post I talk about the implementation goals and the approach. I plan to build this as part of InMobi's HackDay 2015 (if I get through the selection), or go ahead and build it anyway on that weekend.

For every cluster you would need the following configuration settings

  • minNodes - Minimum # of TTs you would always want in the cluster.
  • maxNodes - Maximum # of TTs that your cluster would like to use at any point in time.
  • checkInterval - Time unit in seconds to check the cluster for compute demand (default - 60)
  • hustlePeriod - Time unit in seconds to monitor the demand before we go ahead with upscaling / downscaling the cluster. - (default - 600)
  • upscaleBurstRate - Rate at which you want to upscale based on the demand (default- 100%)
  • downscaleBurstRate - Rate at which you want to downscale (default - 25%)
  • mapsPerNode - # of map slots per TT  (default - based on the machine type)
  • reducersPerNode - # of reduce slots per TT (default - based on the machine type)

Assumptions
- All the nodes in the cluster are of the same type and imageId - easier during upscaling / downscaling.
- All TTs will also have a Datanode running alongside
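
The configuration settings listed above could be captured roughly as follows. This is a sketch only: the defaults come from the list, but the way the burst rates are applied (moving a fraction of the gap between the current and the desired node count) is my own assumption, and the hustle period, which would gate when this function gets called, is not modelled. The names `AutoscaleConfig`, `target_nodes`, `demanded_maps` and `demanded_reduces` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class AutoscaleConfig:
    min_nodes: int
    max_nodes: int
    maps_per_node: int
    reducers_per_node: int
    check_interval: int = 60             # seconds between demand checks
    hustle_period: int = 600             # seconds of sustained demand before acting
    upscale_burst_rate: float = 1.0      # 100%
    downscale_burst_rate: float = 0.25   # 25%


def target_nodes(cfg, current_nodes, demanded_maps, demanded_reduces):
    """Return the node count to move to, given the slot demand (assumed policy)."""
    needed = max(
        math.ceil(demanded_maps / cfg.maps_per_node),
        math.ceil(demanded_reduces / cfg.reducers_per_node),
    )
    desired = min(max(needed, cfg.min_nodes), cfg.max_nodes)
    gap = desired - current_nodes
    if gap > 0:
        step = math.ceil(gap * cfg.upscale_burst_rate)
    else:
        step = -math.ceil(-gap * cfg.downscale_burst_rate)
    return current_nodes + step
```

With these defaults the cluster grows to meet demand in one step but shrinks a quarter of the gap at a time, which errs on the side of keeping capacity around.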

Broad Goals
- Less / No manual intervention at all - We're talking about hands-free scaling and not one click scaling.
- Should have less / no changes in the framework - If we start making forks of Hadoop1 / Hadoop2 to support certain features for autoscaling then most likely we'll have a version lock which is not a pretty thing 1-2 years down the line.
- Should be configurable - For users willing to dive deeper for configuring their autoscaling they should have options to do that. Roughly translates to being all blue configurations having sensible defaults.  

Larger vision is to see if we can make the entire thing modular enough to support any type of scaling. 

Please do share your thoughts if you have any on the subject. 

Sunday, March 15, 2015

Winning #GoPluginChallenge

Winning the #GoPluginChallenge was a very big deal for me. Why? Being acknowledged by a team that rejected me in an interview 3 years ago gives the utmost satisfaction of some kind of achievement in life. All thanks and credit go to my mentors - Rajesh and Sriram. Special thanks to Manoj, who gave me all the motivation for building the Github PR plugin. It is also really nice to know that both the plugins I submitted have won together.