
Commit

deploy: 3904d53
github-actions[bot] committed Feb 16, 2024
1 parent a3c5d2a commit 18b2c12
Showing 13 changed files with 13 additions and 13 deletions.
2 changes: 1 addition & 1 deletion blog/1-5-million-pdfs-in-25-minutes/index.html
@@ -81,4 +81,4 @@
}
Beyond utilizing system jobs, we also use the constraints stanza for further control over worker placement. Through our Terraform module, we pre-populate EC2 tags within the Nomad client configuration, making these tags accessible within Nomad's client meta. These tags then serve as the basis for defining constraints, enabling dynamic job scheduling and node assignment based on specific attributes.
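For illustration, a minimal sketch of how a meta tag becomes a placement constraint; the `pdf-creator` role value here is hypothetical, as the actual role names are defined per ASG in our Terraform module:

```hcl
# Nomad client configuration: the Terraform module writes the EC2 tag
# into client meta. The role value is a hypothetical example.
client {
  enabled = true
  meta {
    nomad_client = "pdf-creator"
  }
}

# Job spec: a constraint pins the job to clients carrying that role.
constraint {
  attribute = "${meta.nomad_client}"
  value     = "pdf-creator"
}
```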
[Image: nomad ui (/static/images/cnotes_wf_3.png)]

Using the `nomad_client` EC2 tag, we determine the role of each client and deploy the corresponding program, which is, in most cases, one of our worker programs written in Go. In the example above, you can see separate ASGs for the signer and pdf-creator tasks. This enables Nomad to ensure they run on distinct sets of nodes for optimal resource utilization.

PDF generation requires significantly more resources than signing, so we use separate ASGs for these processes and scale them independently of other jobs.

Once jobs are initiated, we additionally stream the job statuses to the Rundeck UI from the Redis instance that maintains the global state of all distributed jobs, in case an admin wants to peek.

[Image: rundeck job status ui (/static/images/cnotes_wf_4.png)]

The Rundeck control server runs a Python script to extract job status data from Redis:

```py
import redis

# Connection details here are assumed; the actual host/auth are deployment-specific.
r = redis.Redis()

all_cnotes_str = r.get(
    f"batch-workflows:cnote:all:{posting_date}:{exchange}:{company}:{trade_process_type}"
)
```
For the contract note generation job, we currently spawn about 40 instances in total, a mix of `c6a.8xlarge`, `c6a.2xlarge`, and `c6a.4xlarge`.

Post-execution teardown

Upon the completion of all queued jobs—in this example, the computation, generation, signing, and e-mailing of 1.5+ million PDFs—we initiate the teardown process. A program that monitors the successful job count in the Redis state store triggers this by simply invoking a Terraform teardown, which resets the ASG counts to zero, drains existing nodes, halts Nomad jobs, and shuts down the Nomad server itself.
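A minimal sketch of such a monitor, with assumed Redis key names and connection details (the real program and state layout are internal to our setup):

```go
package main

import (
	"context"
	"log"
	"os/exec"
	"time"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"}) // assumed address

	// Hypothetical keys: the total number of queued jobs, and the
	// running count of successfully completed ones.
	total, err := rdb.Get(ctx, "batch-workflows:cnote:total").Int64()
	if err != nil {
		log.Fatal(err)
	}

	// Poll until every queued job has succeeded.
	for {
		done, err := rdb.Get(ctx, "batch-workflows:cnote:success").Int64()
		if err == nil && done >= total {
			break
		}
		time.Sleep(10 * time.Second)
	}

	// Tear down the ephemeral infra: ASGs to zero, nodes drained,
	// Nomad jobs halted, Nomad server stopped.
	if out, err := exec.Command("terraform", "destroy", "-auto-approve").CombinedOutput(); err != nil {
		log.Fatalf("teardown failed: %v\n%s", err, out)
	}
	log.Println("teardown complete")
}
```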
In this specific example, the entire operation, end to end, finishes in 25 minutes. The cost incurred is, unsurprisingly, negligible.

E-mailing PDFs at high throughput

As the PDF signing workers sign PDFs and hand them over, the files are instantly queued to be e-mailed by the e-mailing workers. We use a self-hosted, auto-scaling Haraka (https://haraka.github.io/) SMTP server cluster, and the e-mail workers maintain concurrent connection pools to it with the smtppool (https://github.com/knadh/smtppool) library we wrote, pushing out e-mails at high throughput.
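A minimal sketch of pooled sending with smtppool, with an illustrative host and addresses; auth, TLS, and attachment handling are omitted for brevity:

```go
package main

import (
	"log"
	"time"

	"github.com/knadh/smtppool"
)

func main() {
	// Maintain a pool of concurrent SMTP connections to the Haraka cluster.
	pool, err := smtppool.New(smtppool.Opt{
		Host:        "smtp.internal.example", // illustrative cluster address
		Port:        25,
		MaxConns:    20,
		IdleTimeout: 10 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}

	// Each signed PDF would be attached here; a bare text e-mail for brevity.
	err = pool.Send(smtppool.Email{
		From:    "contract-notes@example.com", // illustrative addresses
		To:      []string{"client@example.com"},
		Subject: "Contract note",
		Text:    []byte("Your contract note for the day is attached."),
	})
	if err != nil {
		log.Fatal(err)
	}
}
```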
We transitioned from Postal, which we had used for many years, to Haraka for significant performance benefits—a change that merits a separate post of its own. Postal's resource usage was intense, it was not horizontally scalable, and it unfortunately grew into an unfixable bottleneck. With our move to Haraka, which scales horizontally with ease, we are no longer capacity-limited in pushing e-mails from our SMTP cluster out to target mail servers over the internet. It is important to note that IP reputation matters when self-hosting SMTP servers; we have grown and maintained ours over almost a decade—mainly by never sending marketing e-mails, and definitely never spam!

Conclusion

So, that's it. We are very pleased with the throughput we have achieved with the new architecture, thanks primarily to the breakthroughs that Typst and Haraka represent, and to Nomad trivially handling all the orchestration headaches for us. This is not one big software system, but more of a conceptual collection of small scripts and programs that orchestrate workflows for our specific bulk jobs. We are also happy that this effort resulted in the creation and open-sourcing of multiple projects.

We plan on moving all our time-consuming bulk jobs—of which there are plenty in our industry—to the same framework. In addition, we are in the process of adding metrics to the workers so that we can track the global workflow states in real time and adjust resources as necessary.

As the architecture is conceptually simple and rooted in common-sense practices, the fundamental building blocks (Rundeck, Nomad, S3, etc.) can be swapped out entirely if required. We are confident that this will do its job and scale well for a long time, with room for further optimisation. Needless to say, when a few large barebones instances are spawned, highly concurrent business logic saturates all cores throughout, and everything is wound down as quickly as possible, the cost incurred is negligible and any potential resource wastage is minimal, if not nil.

The one-line change in this file fixes the tag links in the page footer:

-<li><a href=/%20/tags/golang>golang</a></li><li><a href=/%20/tags/devops>devops</a></li>
+<li><a href=/tags/golang>golang</a></li><li><a href=/tags/devops>devops</a></li>
2 changes: 1 addition & 1 deletion blog/a-lesson-in-niche-business-dsls-at-scale/index.html
@@ -322,4 +322,4 @@
implement at India's largest stock broker. Writing a framework that abstracts
the annoying caveats of an under-loved feature like Go plugins, to enable
dynamic loading of Go functions to act as rules, is useful, considering the
tangible benefits it presents when compared to writing custom DSLs.

As above, the one-line change in this file fixes the tag link in the page footer:

-<li><a href=/%20/tags/golang>golang</a></li>
+<li><a href=/tags/golang>golang</a></li>

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion blog/being-future-ready-with-common-sense/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion blog/from-native-to-react-native-to-flutter/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion blog/hello-world/index.html

Large diffs are not rendered by default.


0 comments on commit 18b2c12
