Projects

Eurocovid

Eurocovid is an interactive map displaying the number of COVID-19 cases per 100,000 people reported over the past 14 days. The data is sourced automatically from ECDC, and the map is updated whenever new data becomes available: daily for some regions and weekly for others. Hovering over a region displays its name, the normalized number of cases, and the date of the latest data update. The code is open source and written entirely in Python, with Dash for visualization. The regional boundaries are based on the GeoJSON files from Eurostat. The map is currently offline, but it can easily be deployed locally.
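As a rough illustration of the normalization described above, here is a minimal sketch of a 14-day case rate per 100,000 people. The function name, the dictionary layout and the example numbers are all made up for illustration; this is not the project's actual code, and the ECDC data may well arrive in a different shape.

```python
from datetime import date, timedelta


def incidence_per_100k(daily_cases, population, as_of, window_days=14):
    """Cases reported in the `window_days` ending on `as_of`, per 100,000 people.

    `daily_cases` is a hypothetical {date: new_cases} mapping for one region.
    """
    start = as_of - timedelta(days=window_days - 1)
    total = sum(n for day, n in daily_cases.items() if start <= day <= as_of)
    return total * 100_000 / population


# Made-up example: 10 new cases per day for 20 days in a region of 50,000 people.
example_cases = {date(2021, 1, 1) + timedelta(days=i): 10 for i in range(20)}
example_rate = incidence_per_100k(example_cases, 50_000, date(2021, 1, 20))
```

Only the last 14 days fall inside the window, so the example counts 140 cases and gives a rate of 280 per 100,000.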

Quant-arxiv sanity preserver

Quant-arxiv sanity preserver is an academic paper management system that makes it easier to stay up to date with the quantitative finance literature, without having to scroll through tens of publications every day or decipher the ugly arXiv email digest, which gives no indication of the relevance of each new paper. The platform lets users save, discuss and filter papers from arXiv /q-fin, shows the most popular papers every day, and recommends new ones based on the papers the user has saved in their library. It is an adaptation to the world of quant finance of the already popular arxiv-sanity-preserver by Andrej Karpathy. I am aiming to expand the sources of papers beyond arXiv and to add an email digest with customizable frequency, so that updates cover only relevant papers and reduce the clutter. Behind the scenes, TF-IDF vectors are created to cluster papers, and SVMs are trained for each user. The platform is currently offline due to high hosting costs, but the code is free and open source on GitHub.
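To give a feel for the TF-IDF side of this, here is a small, dependency-free sketch. It deliberately swaps the per-user SVM for a much simpler ranking by cosine similarity to the user's saved papers, so it is a simplification and not the platform's code. The function names, the smoothed IDF formula and the toy abstracts are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one L2-normalised {term: weight} dict per document."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        counts = Counter(tokens)
        # Term frequency times a smoothed inverse document frequency.
        vec = {t: c * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0)
               for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def dot(a, b):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def recommend(docs, saved_ids, top_k=2):
    """Rank unsaved documents by similarity to the sum of the saved ones."""
    vectors = tfidf_vectors(docs)
    profile = {}
    for i in saved_ids:
        for term, weight in vectors[i].items():
            profile[term] = profile.get(term, 0.0) + weight
    candidates = [i for i in range(len(docs)) if i not in saved_ids]
    # The profile is not normalised; that rescales every score equally,
    # so the ranking is unaffected.
    ranked = sorted(candidates, key=lambda i: dot(profile, vectors[i]), reverse=True)
    return ranked[:top_k]


# Toy abstracts, invented for the example; the user has saved paper 0.
abstracts = [
    "option pricing under stochastic volatility",
    "deep hedging of option portfolios",
    "order book dynamics and market impact",
    "calibration of stochastic volatility models",
]
suggestions = recommend(abstracts, saved_ids={0}, top_k=2)
```

Here paper 3 ranks first because it shares two terms with the saved paper, followed by paper 1, which shares one. A trained classifier such as an SVM can also learn from papers the user did not save, which this similarity ranking ignores.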

Library of Words

The Library of Words is a digital collection of pages filled with every possible combination of up to 320 words in the English language. The library starts with a page containing only the first word, “a”, and finishes with a page containing the last word, “zyzzyvas”, repeated 320 times. The dictionary used in the library contains 443,437 English words. This means that every book, thought, love story, news tragedy, war, biography, scientific discovery or truth about the universe which has ever been written with those English words, or is yet to be written, is already present in this library. The concept of the Library of Words is based on the short story La biblioteca de Babel by Argentinian author and librarian Jorge Luis Borges. In the story, the library consists of repeated adjacent hexagonal rooms with shelves on four walls, containing books filled with every possible combination of 25 orthographic symbols (22 letters plus period, comma and space). In 2015, Jonathan Basile created a digital version of the library with 29 characters (26 letters plus period, comma and space), using a base-29 conversion system and a pseudo-random number generator to link a hypothetical book location with the text in the book. In the Library of Words I revisited Borges’ idea and used a system similar to Basile’s algorithm to produce its pages.
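The base-conversion idea can be sketched with a tiny stand-in dictionary. The sketch below assumes pages are simply numbered in order, from the shortest to the longest, and it leaves out the pseudo-random step that Basile's approach uses to scatter pages around the library. The names `page_words` and `total_pages` and the three-word vocabulary are hypothetical, not taken from the project.

```python
def total_pages(vocab_size, max_words=320):
    """Number of distinct pages holding between 1 and `max_words` words."""
    return sum(vocab_size ** k for k in range(1, max_words + 1))


def page_words(index, vocabulary, max_words=320):
    """Map a 1-based page index to its word sequence.

    This is a bijective base-N conversion, with N = len(vocabulary):
    each "digit" selects one word from the sorted vocabulary.
    """
    n = len(vocabulary)
    if not 1 <= index <= total_pages(n, max_words):
        raise ValueError("page index out of range")
    words = []
    while index > 0:
        index, digit = divmod(index - 1, n)
        words.append(vocabulary[digit])
    return words[::-1]


# Stand-in dictionary of three words instead of 443,437.
tiny_vocabulary = ["a", "b", "c"]
first_page = page_words(1, tiny_vocabulary, max_words=2)
last_page = page_words(total_pages(3, max_words=2), tiny_vocabulary, max_words=2)
```

With this numbering the first page holds the single first word and the last page holds the last word repeated `max_words` times, matching the description above. With the full dictionary the page count is the sum of 443,437 to the power k for k from 1 to 320, which is why pages have to be computed on demand and cannot be stored.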

ShingleBot

ShingleBot is a more humorous side project I came up with while playing with the Library of Words. Knowing that human language follows a simple mathematical form, known as Zipf’s law, I wondered how hard it would be to create semi-intelligible sentences using only a Zipf distribution and a few grammatical rules. The result is a generator of word n-grams (aka shingles) which can lead to funny sentences. This is far from the best way to produce natural language. There are better and far more realistic methods, such as Markov chain text generators, but it was fun to build, and it was surprising to get decent-enough results from such a simple algorithm.
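A minimal sketch of the sampling step might look like the following, assuming the words are already sorted from most to least frequent and that the probability of a word is proportional to 1/rank. It leaves out the grammatical rules entirely, and the function names and the short word list are invented for the example.

```python
import random


def zipf_weights(n_words, exponent=1.0):
    """Unnormalised Zipf weights: the word at rank r gets weight 1 / r**exponent."""
    return [1.0 / rank ** exponent for rank in range(1, n_words + 1)]


def sample_shingle(words_by_rank, length=3, rng=None):
    """Draw `length` words independently, favouring low-rank (common) words."""
    rng = rng or random.Random()
    weights = zipf_weights(len(words_by_rank))
    return rng.choices(words_by_rank, weights=weights, k=length)


# Hypothetical frequency-ranked word list, most common first.
ranked_words = ["the", "of", "and", "to", "cat", "quantum", "zyzzyvas"]
shingle = sample_shingle(ranked_words, length=3, rng=random.Random(0))
```

Because every word is drawn independently, the output has no memory of the previous word, which is exactly what a Markov chain generator adds.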

Bad Scientist

I used to maintain a blog of spurious posts about science and technology. I no longer write for it, but you can have a look at the archive here.

Grep Linux

I also used to maintain a blog documenting my journey through teaching myself to use various Linux distributions, sharing tips and tricks in Bash and beyond. I no longer maintain the blog, but you can look through the archive here.