Dump docs

“Personal Blog / Scribbles”

Robust MLOps Infrastructure

Building a Robust Machine Learning Operations Infrastructure

Creating a reliable and flexible MLOps infrastructure is critical for organizations leveraging machine learning in production. In this post, I’ll explore practical challenges that emerge when deploying ML systems and propose pragmatic solutions using open-source tools and standard workflows.

The Challenge of Production ML Systems

After deploying a machine learning model in production, a new set of challenges emerges. The model pipeline typically runs from data storage (like S3), through orchestration tools (such as Airflow, Prefect, or Flyte), and back to storage for processed results. But this simple flow becomes complex when we consider the full ML lifecycle. ...

Top 50 SQL for Mastery

Recyclable and Low Fat Products

    import pandas as pd

    def find_products(products: pd.DataFrame) -> pd.DataFrame:
        # return products[(products["low_fats"] == "Y") & (products["recyclable"] == "Y")].loc[:, ["product_id"]]
        return products[(products["low_fats"] == "Y") & (products["recyclable"] == "Y")][["product_id"]]

    select product_id from Products where low_fats = 'Y' and recyclable = 'Y'

Note: df.loc[:, ["column_name"]] and df[["column_name"]] both return a DataFrame; df["column_name"] returns a Series.

Find Customer Referee

    def find_customer_referee(customer: pd.DataFrame) -> pd.DataFrame:
        return customer[(customer["referee_id"] != 2) | (customer["referee_id"].isnull())][["name"]]

or

    def find_customer_referee(customer: pd.DataFrame) -> pd.DataFrame:
        referee_id = customer["referee_id"].fillna(0)  # fill nulls on a copy, not on the caller's frame
        return customer.loc[referee_id != 2, ["name"]]

Note: the filter condition can go directly inside loc.

    select name from Customer where referee_id != 2 or referee_id is null

Big Countries

    def big_countries(world: pd.DataFrame) -> pd.DataFrame:
        return world[(world["area"] >= 3000000) | (world["population"] >= 25000000)][["name", "population", "area"]]

    select name, population, area from World where population >= 25000000 or area >= 3000000

Article Views

    def article_views(views: pd.DataFrame) -> pd.DataFrame:
        return (views[views["viewer_id"] == views["author_id"]][["author_id"]].drop_duplicates()
                .sort_values("author_id").rename(columns={"author_id": "id"}))

    select distinct author_id as id from Views where author_id = viewer_id order by id

Invalid Tweets

    select tweet_id from Tweets where length(content) > 15

    def invalid_tweets(tweets: pd.DataFrame) -> pd.DataFrame:
        return tweets.query("content.str.len() > 15")[["tweet_id"]]

or ...
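To make the notes above concrete, here is a small sketch of the bracket behaviour and the null-aware filter. The sample rows are made up for illustration and are not from the original problems; the function mirrors the first `find_customer_referee` variant.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
customer = pd.DataFrame(
    {"name": ["Will", "Jane", "Alex", "Zack"], "referee_id": [None, None, 2, 1]}
)

# Single brackets select one column as a Series;
# double brackets (a list of column names) keep a DataFrame.
as_series = customer["name"]
as_frame = customer[["name"]]


def find_customer_referee(customer: pd.DataFrame) -> pd.DataFrame:
    # Keep rows whose referee is not 2, and keep rows with no referee at all.
    # The explicit isna() check mirrors the SQL "or referee_id is null".
    mask = (customer["referee_id"] != 2) | customer["referee_id"].isna()
    # The boolean mask and the column list can both go inside loc.
    return customer.loc[mask, ["name"]]


result = find_customer_referee(customer)
print(type(as_series).__name__, type(as_frame).__name__)
print(result["name"].tolist())
```

The function does not modify the frame passed in, so it can be called repeatedly on the same data.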

Machine learning Box

Imagine a box where you put all of your machine learning stuff. Here it is. [WIP] Will update the structure.

Bias vs Variance

Metrics: Precision, Recall, Accuracy, F1-score

Cross-Validation

How do you choose which cross-validation technique to use for your project? Think about how your model will be used and how it will interact with the data in a deployed setting. If the dataset is huge, use hold-out, which is basically the 80-20 method ...
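As a rough illustration of the hold-out (80-20) split and the metrics listed above, here is a minimal pure-Python sketch. The function names `holdout_split` and `classification_metrics` are made up for this example, and the labels are toy data; in practice a library such as scikit-learn would usually do this.

```python
import random


def holdout_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows once and hold out a fraction of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def classification_metrics(y_true, y_pred, positive=1):
    """Precision, recall, accuracy and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "accuracy": accuracy, "f1": f1}


train, test = holdout_split(range(100))
metrics = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(len(train), len(test))
print(metrics)
```

A single random hold-out like this assumes the rows are independent; time-ordered or grouped data would need a split that respects that structure.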

Setting up Nvidia SDK Manager and Torch Library in Jetson Board

This blog serves as a backup for setting up the Nvidia Jetson AGX Orin for Development/Production. Please note: review all the steps before proceeding. I was updating from JetPack 5.x to 6.x. My Jetson AGX Orin had already been flashed.

Installation

The blog covers installing the Nvidia SDK Manager. Make sure to install Ubuntu 22.04, not 24.04, when installing JetPack 6 (as of August 19, 2024). Similarly, upgrade the host machine and install all the necessary Nvidia drivers and CUDA on the host machine before installing the Nvidia SDK Manager. More information on installing Nvidia drivers on Ubuntu ...
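One way to sanity-check the Ubuntu version before starting is to read the standard `/etc/os-release` file. The sketch below assumes the 22.04 requirement applies to the host machine, as the post suggests for August 2024; the helper names `parse_os_release` and `host_matches` and the sample text are made up for this illustration.

```python
from pathlib import Path

# Hypothetical sample of /etc/os-release contents, used when the file is absent.
SAMPLE = 'NAME="Ubuntu"\nID=ubuntu\nVERSION_ID="22.04"\n'


def parse_os_release(text: str) -> dict:
    """Parse KEY=value lines from os-release style text into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"')
    return fields


def host_matches(fields: dict, wanted_version: str = "22.04") -> bool:
    """True when the host reports Ubuntu at the wanted version."""
    return fields.get("ID") == "ubuntu" and fields.get("VERSION_ID") == wanted_version


os_release = Path("/etc/os-release")
text = os_release.read_text() if os_release.exists() else SAMPLE
print("Host matches Ubuntu 22.04:", host_matches(parse_os_release(text)))
```

The supported host versions change between SDK Manager releases, so the `wanted_version` default should be checked against the current release notes.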

Installing CUDA 12.x on Ubuntu 24.04

MOTIVE

So, Ubuntu 24.04 LTS was released on 25 April 2024. It's been more than 3 months since the release, and I thought it would be safe enough to install it on my main working computer. Wrong!! I wanted documentation for setting up CUDA 12.2 on Ubuntu 24.04 but ended up reinstalling everything. Thus, here is the blog for future me to set up everything. Thanks to all the great help I found on the Internet [referenced as links]. ...