I'm a Toronto-based Lead AI/Data Engineer at RBC with a background in LLMs, data engineering, and statistics. 🤓 I have extensive experience designing end-to-end solutions, from algorithms and ETL pipelines to large datasets, that optimize workflows and address big data challenges in the financial space. 🤖
Lead AI & Data Engineer
Focused on orchestrating big data pipelines on Snowflake and Hadoop and building GenAI/ML use cases for Global Wealth customers and advisors.
Leverage wealth management data to execute ETL and ELT processes on Snowflake. Build GenAI solutions using Hugging Face and OpenAI models deployed on AWS and on-premises, perform statistical analyses, and apply fine-tuning and RLHF techniques. Read research papers to improve LLM performance. Build production-ready applications.
Senior Data Engineer
Focused on processing big datasets and providing data analytics support to the Global Business Payments and Credit Risk teams in the Caribbean, Peru, Canada, and the US.
Responsible for improving SQL and NoSQL queries. Analyzed large datasets (~100 GB per day) to detect fraudulent transactions. Created various automation tools using Python, Airflow, and JavaScript, saving 4,000 manual hours.
Data Scientist
Analyzed datasets to catch fraudulent transactions. Supported the AML/ATF, Global Business Market, Global Wealth, and Cybersecurity teams.
Responsible for parsing SWIFT transactions for AML/ATF purposes using Python and R. Focused heavily on source-to-target data validation and completeness testing. Built models using scikit-learn and NLTK. Created and automated ~25 production-ready Power BI dashboards.
Data Solution Partner
Assisting new or existing Hydra customers with data warehouse implementation, analytics, and reporting services.
Responsible for ETL into a Hydra data warehouse, linking open-source services such as Airflow, dbt (data build tool), Airbyte, and many more to fully automate pipelines for retention metrics, visualizations, etc. Also responsible for data migration from public clouds to Hydra and for assisting with query optimization.
Full-Stack Developer
Responsible for deploying automated subdomains on the internet using Docker and AWS at a startup to support small businesses.
My key responsibility was to use the AWS Route 53 service to automate the deployment of a MERN stack. I also learned to design Dockerfiles for scalability and high performance. Reduced production website loading time by 100 ms using Gzip compression and Cloudflare.
Web Developer
Responsible for designing, coding and maintaining websites and databases for customers to manage vehicle bookings.
Here, I gained deep expertise and hands-on experience with web applications and technologies such as HTML, CSS, JavaScript, PHP, jQuery, and APIs. I was also responsible for managing RESTful services and developed knowledge of search engine optimization (SEO).
Developer
Responsible for managing Azure cloud VMs and an Nginx web server to provide web traffic data to Elasticsearch indices.
Managed a production-level Nginx web server. Resolved application port issues on production Ubuntu servers using port forwarding and proxies. Managed Elasticsearch indices and analyzed the data using Kibana.
I was born and raised in India. After completing high school, I moved to Toronto, Canada to pursue my bachelor's degree, and have lived here ever since.
Outside of work, I enjoy reading LLM research papers, configuring and networking servers, and deploying services or applications such as Grafana, Hadoop, Node.js, and more. I like to learn technical and non-technical skills in my free time. I can also solve a Rubik's cube in under a minute, and I am on a path to becoming a better thinker.
Programming Languages: Python, SQL, Git
Most Used Tools: Hugging Face LLMs (esp. Meta Llama 3.1 8B-70B, DeepSeek models), OpenAI model APIs (GPT-4o, GPT-3.5), AWS Bedrock, AWS SageMaker model deployments (with Airflow) for LLM inference in on-premises cloud, AWS, Azure, GitHub Actions for CI/CD, Snowflake, dbt
A ChatGPT-like chatbot deployed on Streamlit Cloud, powered by OpenAI's GPT-4 Turbo and GPT-3.5 Turbo models. Hosted on a private server and exposed via Cloudflare Tunnel.
OCR project using OpenCV, Tesseract, and EasyOCR to process retailer receipts, implemented with FastAPI, WSGI (Gunicorn), and Docker, and hosted on Cloudflare.
Node.js real-time web application built with Socket.IO and Express, running on Docker.
Matrix is an audiobook app interface built with the React framework, using Material UI design and React Bootstrap along with npm packages.
Validates the skills and knowledge to apply Azure's machine learning techniques to train, evaluate, and deploy models that solve business problems.
"Best of the Best Award" - Top Individual Performance
Solid knowledge of data processing languages such as SQL, Python, or Scala, and an understanding of parallel processing and data architecture patterns.
Hands-on experience completing the day-to-day operational tasks needed to use MongoDB.
Demonstrated the fundamental skills needed to effectively create, manage, and monitor DAGs on Apache Airflow.
Completed a workshop provided by Microsoft at Scotiabank on business analytics using Power BI.
Earned Software Testing with Visual Studio Certification for managing test executions.
Achieved CompTIA Linux+ System Architecture Certification.
Qualified and achieved a state-level rank in an all-India event held by 'NGSF'.
Earned a global rank of 5,636 in the Mathematics Problem Solving Ability Test.
Send your message below or email me at [email protected] to get an immediate response.