Patrick Nicolas
6 followers
Posts

Reinforcement learning in Scala
You may wonder how robots, autonomous systems or software game players learn. The answer lies in a field of AI known as reinforcement learning. For example, a robot navigating a maze plans its next move according to its current location and previous moves....

Spark ML pipelines I - Features encoding
Apache Spark introduced machine learning (ML) pipelines in version 1.4.0. A pipeline is a workflow, or sequence of tasks, that cleanses, filters, trains, classifies, predicts and validates a data set. Those tasks are defined as stages of the pipeline. Spark 2.0...

Monte Carlo Integration in Scala
This post introduces an overlooked numerical integration method leveraging the ubiquitous Monte Carlo simulation. Not every function has a closed-form antiderivative from which a definite integral can be computed, an approach known as symbolic integration. There are many numerical integration met...
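As a rough illustration of the idea in this teaser, here is a minimal sketch of a mean-value Monte Carlo estimator: the integral of f over [a, b] is approximated by (b - a) times the average of f at uniformly drawn points. The object name `MonteCarloIntegration` and its signature are assumptions made for this sketch; the linked post may well use a different variant (for instance, a hit-or-miss estimator).

```scala
import scala.util.Random

// Hypothetical helper, not taken from the post.
object MonteCarloIntegration {
  // Approximates the integral of f over [a, b] with n uniform samples:
  // (b - a) * mean(f(x_i)), where each x_i is drawn uniformly from [a, b).
  def integrate(f: Double => Double, a: Double, b: Double, n: Int, rng: Random): Double = {
    require(n > 0, "n must be positive")
    require(b > a, "b must be greater than a")
    var sum = 0.0
    var i = 0
    while (i < n) {
      sum += f(a + (b - a) * rng.nextDouble())
      i += 1
    }
    (b - a) * sum / n
  }
}
```

For example, `MonteCarloIntegration.integrate(x => x * x, 0.0, 1.0, 200000, new Random(42L))` should land close to 1/3; the error of this estimator shrinks roughly as one over the square root of the sample count.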

Extending Apache Spark/MLlib with AdaGrad
Stochastic gradient descent (SGD) is a critical element in the training of machine learning models such as support vector machines, logistic regression or back-propagation neural networks. In its simplest incarnation, the gradient is computed using a si...
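To make the AdaGrad idea named in the title concrete, the following is a minimal plain-Scala sketch of one AdaGrad update: each weight keeps a running sum of its squared gradients, and its step is scaled by the inverse square root of that sum. The `AdaGrad` object, the `State` case class and the parameter names are assumptions made for this sketch; it does not use or reproduce any Spark MLlib API, and the post's actual integration with MLlib is not shown here.

```scala
// Hypothetical, library-free sketch; not the post's MLlib extension.
object AdaGrad {
  // weights: current parameters; sumSquares: per-weight sum of squared gradients so far.
  final case class State(weights: Vector[Double], sumSquares: Vector[Double])

  // One AdaGrad step: w_i <- w_i - learningRate * g_i / (sqrt(G_i) + epsilon),
  // where G_i is the accumulated squared gradient for weight i (including g_i).
  def step(state: State, gradient: Vector[Double], learningRate: Double, epsilon: Double = 1e-8): State = {
    require(gradient.length == state.weights.length, "gradient and weights must have the same size")
    val sumSquares = state.sumSquares.zip(gradient).map { case (s, g) => s + g * g }
    val weights = state.weights.zip(gradient).zip(sumSquares).map { case ((w, g), s) =>
      w - learningRate * g / (math.sqrt(s) + epsilon)
    }
    State(weights, sumSquares)
  }
}
```

Because the accumulated sum only grows, the effective learning rate of each weight decreases over time, which is the main behavioral difference from plain SGD with a fixed step size.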

Fibonacci Recursive Implementation Galore
This post evaluates the performance of tail recursion in Scala relative to alternative solutions, as applied to the Fibonacci formula. Overview: There are many ways to skin a cat and implement the Fibonacci recursion. This post illustrates the power of ta...
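For readers unfamiliar with the technique, a tail-recursive Fibonacci can be sketched as below. The object and method names are assumptions for this sketch, and the post's own implementations and benchmarks may differ. The `@tailrec` annotation asks the compiler to verify that the recursive call is in tail position, so the function compiles to a loop and does not grow the stack.

```scala
import scala.annotation.tailrec

// Hypothetical sketch; not the post's code.
object Fibonacci {
  // Returns F(n) with F(0) = 0 and F(1) = 1, using BigInt to avoid overflow.
  def fib(n: Int): BigInt = {
    require(n >= 0, "n must be non-negative")

    // (a, b) holds two consecutive Fibonacci numbers; i counts the remaining steps.
    @tailrec
    def loop(i: Int, a: BigInt, b: BigInt): BigInt =
      if (i == 0) a else loop(i - 1, b, a + b)

    loop(n, BigInt(0), BigInt(1))
  }
}
```

With this definition, `Fibonacci.fib(10)` evaluates to 55, and large inputs run in linear time without a stack overflow.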

Bootstrapping by resampling with replacement
Bootstrapping is a statistical resampling method that consists of randomly sampling a dataset with replacement. This technique enables data scientists to estimate the sampling distribution of a wide variety of probability distributions. Background: One key obj...
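Sampling with replacement, as described in this teaser, can be sketched in a few lines. The `Bootstrap` object and its method names are assumptions made for this sketch and are not taken from the post: `resample` draws as many points as the original dataset, allowing repeats, and `replicates` recomputes a chosen statistic on many such resamples to approximate its sampling distribution.

```scala
import scala.util.Random

// Hypothetical sketch; not the post's code.
object Bootstrap {
  // Draws data.length elements from data, uniformly and with replacement.
  def resample(data: IndexedSeq[Double], rng: Random): IndexedSeq[Double] = {
    require(data.nonEmpty, "data must not be empty")
    IndexedSeq.fill(data.length)(data(rng.nextInt(data.length)))
  }

  // Computes the statistic on numReplicates independent bootstrap resamples.
  def replicates(
      data: IndexedSeq[Double],
      statistic: IndexedSeq[Double] => Double,
      numReplicates: Int,
      rng: Random): IndexedSeq[Double] =
    IndexedSeq.fill(numReplicates)(statistic(resample(data, rng)))
}
```

For instance, passing `xs => xs.sum / xs.length` as the statistic yields a collection of bootstrap means, whose spread gives an estimate of the standard error of the sample mean.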

Normalized Discounted Cumulative Gain in Scala
Overview: Numerous real-life applications of machine learning require predicting the most relevant ranking of items to optimize an outcome. For instance:
- Evaluate and prioritize counter-measures to a cyber-attack
- Rank symptoms in a clinical trial
- Extract d...
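The metric named in the title can be sketched as follows. This version assumes the common exponential-gain form, DCG = sum over positions i (starting at 1) of (2^rel_i - 1) / log2(i + 1), normalized by the DCG of the ideal (descending) ordering; a linear-gain variant also exists, and the post may use either. The `Ndcg` object and its method names are assumptions for this sketch.

```scala
// Hypothetical sketch; not the post's code.
object Ndcg {
  // Discounted cumulative gain of relevance scores listed in ranked order.
  // The zero-based index i corresponds to rank i + 1, hence log2(i + 2).
  def dcg(relevances: Seq[Double]): Double =
    relevances.zipWithIndex.map { case (rel, i) =>
      (math.pow(2.0, rel) - 1.0) / (math.log(i + 2.0) / math.log(2.0))
    }.sum

  // DCG divided by the DCG of the best possible ordering; 0.0 if nothing is relevant.
  def ndcg(relevances: Seq[Double]): Double = {
    val ideal = dcg(relevances.sortWith(_ > _))
    if (ideal == 0.0) 0.0 else dcg(relevances) / ideal
  }
}
```

A ranking that already lists items by decreasing relevance scores 1.0, and any other ordering of the same items scores lower.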

Weighting logistic loss for imbalanced dataset in Spark
Overview: Some applications, such as spam filtering or online targeting, have an imbalanced dataset. The number of observations associated with one label is very small (minority class) compared to the number of observations associated with the other labels. Let's consider t...
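One common way to weight the logistic loss for an imbalanced dataset is to give each class a weight inversely proportional to its frequency. The sketch below is plain Scala under that assumption: the `WeightedLogLoss` object, its method names and the balancing formula n / (k * count) are illustrative choices, not the post's scheme and not a Spark API.

```scala
// Hypothetical, library-free sketch; not the post's Spark code.
object WeightedLogLoss {
  // Balancing weight per class: n / (k * count(class)), with n observations and k classes.
  // A rare class therefore receives a larger weight.
  def classWeights(labels: Seq[Int]): Map[Int, Double] = {
    val n = labels.size.toDouble
    val groups = labels.groupBy(identity)
    groups.map { case (label, members) => label -> n / (groups.size * members.size) }
  }

  // Mean weighted binary log loss; probabilities(i) is the predicted P(label = 1).
  def loss(labels: Seq[Int], probabilities: Seq[Double], weights: Map[Int, Double]): Double = {
    require(labels.nonEmpty && labels.size == probabilities.size, "inputs must be non-empty and aligned")
    val eps = 1e-12 // keeps log() away from zero
    val terms = labels.zip(probabilities).map { case (y, p) =>
      val q = math.min(math.max(p, eps), 1.0 - eps)
      -weights(y) * (if (y == 1) math.log(q) else math.log(1.0 - q))
    }
    terms.sum / labels.size
  }
}
```

With labels `Seq(0, 0, 0, 1)`, the minority class 1 gets weight 2.0 and the majority class 0 gets weight 2/3, so a misclassified minority observation contributes three times as much to the loss as a misclassified majority observation.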

Managing Spark context in ScalaTest
Overview: Debugging an Apache Spark application using ScalaTest seems quite simple when dealing with a single test:
- Specify your Spark context configuration (SparkConf)
- Create the Spark context
- Add test code related to your application
- Clean up resources (Spark c...

Weighting logistic loss for imbalanced dataset in Spark
Overview: Some applications, such as spam filtering or online targeting, have an imbalanced dataset. The number of observations associated with one label is very small (minority class) compared to the number of observations associated with the other labels. For instance, th...