Tigor Sinuraja personal website

Machine Learning

Update 202209: I am working on a system that predicts a stock price and buys and sells automatically. It's all Databricks notebooks and Delta. I am reinventing the wheel for sure, but I am having fun, and I am too lazy to research properly where I should invest. So I let the cluster do it for me :-).

The MMA classifier is switched off because Betfair is no longer available in the Netherlands. So the stock market is the new game.

Update 202208: After using a lot of tooling on-prem and in the cloud, I was wondering what I would actually use if I had a free choice while building a new analytics or Big Data solution. The answer is: Databricks, PySpark and Delta Lake on Azure. This combination gives you control where you want it, plus the elasticity of cloud compute resources. I've got a feeling that although low-code tools like Informatica have a permanent place in the data engineering world, Python and PySpark will gain more market share.

I have moved my MMA classifier to Databricks and Delta, and the transition was smooth. It all worked in the Azure cloud within one week. Unfortunately, Betfair is not available in the Netherlands anymore, so I can't use its API to place bets automatically. But this was never about financial gain for me. It was mostly healthy curiosity. I wanted to see the whole system work, and I have seen it working!

Just like so many data fans before me, I have been looking at Machine Learning and how one can apply it. This is a presentation I made about a model I created back in 2018, when I worked at i2i, a great Dutch company in Amsterdam. The inspiration came from a good colleague of mine, Thierry Hennekes, when we were working together at UMCU. We tried to predict the result of an MMA fight based on the stats of the fighters. I learned Python, web scraping, REST APIs and Machine Learning in the process. The goal was to let ML make me rich :-). That hasn't happened just yet, but I had a lot of fun trying. Some of the code is in my GitHub repo. For obvious reasons, it's not the last working version.



Update 202111: I began to move the system to the Azure cloud. This version leverages the power of Azure (automated) ML. I didn't use web scraping this time and downloaded a UFC dataset from Kaggle instead. I trained a model and deployed it as a web service. I tested the endpoint from a Python script and it came back with a response. No-code ML, and it's working!
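For the curious, testing a deployed scoring endpoint from Python can look roughly like the sketch below. This is not my actual script. The URL, the API key, the `{"data": [...]}` payload shape and the feature names are all assumptions for illustration, and the real request format depends on how the model was deployed.

```python
import json
import urllib.request


def build_scoring_request(url, api_key, rows):
    """Build a POST request that carries the input rows as JSON.

    The {"data": rows} wrapper is an assumed payload shape, not a fixed
    contract; check the scoring script of your own deployment.
    """
    body = json.dumps({"data": rows}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Assumes key-based auth passed as a bearer token.
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def score(url, api_key, rows, timeout=10):
    """Send the rows to the endpoint and return the decoded JSON response."""
    request = build_scoring_request(url, api_key, rows)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical feature names; this only builds the request, it sends nothing.
    example_rows = [{"reach_diff": 5.0, "age_diff": -2.0}]
    req = build_scoring_request("https://example.invalid/score", "MY_KEY", example_rows)
    print(req.get_method(), req.full_url)
```

Calling `score(...)` with the real endpoint URL and key would send the request. The build step is kept separate so the payload can be inspected without touching the network.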




My experience with Machine Learning also includes a project I did at Dutch Railways, where I used the Spark ML library from Java. The goal was to predict the total number of train passengers while leveraging the power of a Hadoop cluster for the memory- and CPU-intensive calculations.

Last Updated on Sunday, 18 September 2022 09:56  

