Welcome to my personal website where I showcase my work and skills.
I’m Yash, a Software Development Engineer with strong experience in Machine Learning, Deep Learning, NLP, Computer Vision, Backend Engineering, Cloud Integrations, and Automation. I enjoy building intelligent, scalable, and high-impact systems.
At Onemind Services, I’ve engineered solutions such as Azure Entra SSO integration, NetBox automation workflows, Aruba Central provisioning, ML-driven log analytics using Elasticsearch, DiffSync-based synchronization, and CI/CD automation via GitHub Actions. My work bridges ML + Software Engineering + Cloud.
I’ve also built end-to-end ML projects, including CNN-based image classification from scratch and transformer-based NLP models for content moderation, and contributed to open-source NetBox development. Outside of work, I enjoy exploring new technologies and deepening my AI/ML expertise.
Bachelor of Computer Applications (Computer Science, Applied Statistics), Bharat Institute of Technology, Meerut
Software Development Engineer at Onemind Services LLC
Gurugram, Haryana
English, Hindi
Feel free to reach out to me for any questions or opportunities!
yashpal86300@gmail.com
Gurugram, Haryana
# Advanced Machine Learning Pipeline with Scikit-learn
import numpy as np
import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
# Load dataset
data = load_breast_cancer()
X, y = data.data, data.target
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)
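# Build pipeline (scaling + model)
# Note: random forests are insensitive to feature scaling; the scaler only
# matters if the classifier is swapped for a scale-sensitive estimator.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", RandomForestClassifier(random_state=42)),
])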
# Hyperparameter tuning
param_grid = {
    "clf__n_estimators": [100, 200, 300],
    "clf__max_depth": [None, 5, 10, 20],
    "clf__min_samples_split": [2, 5, 10],
}
grid = GridSearchCV(
    estimator=pipeline,
    param_grid=param_grid,
    cv=5,
    n_jobs=-1,
    scoring="accuracy",
)
# Train optimized model
grid.fit(X_train, y_train)
# Best model from GridSearch
best_model = grid.best_estimator_
print("Best Parameters:", grid.best_params_)
# Prediction
y_pred = best_model.predict(X_test)
# Evaluation
accuracy = accuracy_score(y_test, y_pred)
print(f"\nTest Accuracy: {accuracy:.4f}\n")
print(classification_report(y_test, y_pred))
# Feature importance (from RandomForest)
rf = best_model.named_steps["clf"]
importances = rf.feature_importances_
print("Top 5 Important Features:")
top_idx = np.argsort(importances)[-5:][::-1]
for idx in top_idx:
    print(f"{data.feature_names[idx]}: {importances[idx]:.4f}")
# Save model
joblib.dump(best_model, "optimized_random_forest.pkl")
print("\nModel saved as optimized_random_forest.pkl")
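
The saved file can later be reloaded for inference without retraining. The following is a minimal sketch of that round trip, not part of the pipeline above: to stay self-contained and fast it trains a small fixed pipeline instead of running the grid search (the `n_estimators=50` setting is an assumption), and it writes to a temporary directory (the `model_path` location is hypothetical).

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Same dataset and split settings as the pipeline above
data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, random_state=42, stratify=data.target
)

# Stand-in for the tuned model: a small fixed pipeline (assumption, no grid search)
model = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", RandomForestClassifier(n_estimators=50, random_state=42)),
])
model.fit(X_train, y_train)

# Hypothetical save location in a temporary directory
model_path = os.path.join(tempfile.mkdtemp(), "optimized_random_forest.pkl")
joblib.dump(model, model_path)

# Reload the whole pipeline (scaler + classifier) and predict on unseen data
loaded_model = joblib.load(model_path)
original_pred = model.predict(X_test)
loaded_pred = loaded_model.predict(X_test)

# The reloaded pipeline should reproduce the original predictions exactly
predictions_match = bool(np.array_equal(original_pred, loaded_pred))
print("Predictions match after reload:", predictions_match)

# Class probabilities for a single test sample
sample_proba = loaded_model.predict_proba(X_test[:1])[0]
for name, p in zip(data.target_names, sample_proba):
    print(f"{name}: {p:.3f}")
```

Because the scaler is stored inside the pipeline, raw feature values can be passed straight to `predict`. A joblib file should only be loaded from a trusted source, and ideally with the same scikit-learn version that produced it.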