The book explores in depth the key traits of successful companies like Amazon, Apple, Microsoft, etc. It distinguishes four main areas: ownership, openness, speed, and science. Of the four, openness is presented as the key to all the others, since it naturally enables them.
For myself, I would summarize the book in two aspects: environment and speed.
The company needs to focus on creating an environment where ideas can move freely and where basic needs like access to code (including the ability to change it), data, documentation, and support are easily available. In addition, negative feedback should be not only welcomed but actively sought, so that issues are discovered as early as possible and do not grow into large problems. This requires an atmosphere where people are not punished for mistakes, so there is no implicit incentive for cover-ups.
As a continuation of my attempts to regain control over my attention, two months ago I switched my phone to black-and-white mode, then applied the same filter at 30% on my laptops, and eventually settled on a filter intensity of 20-30% on all my devices. It helped a bit; full black-and-white actually helps quite significantly, but it is a bit tiring to use.
I’ve always wanted to have a way to build data processing pipelines in Python
using pipes, like this: range(10) | F(is_odd) | P(lambda x: x * 2), instead of functions, generators, maps, and loops.
So I’ve tried …
The idea is pretty simple: create a class that implements the OR and ROR operators, i.e. the pipes.
def __or__(self, other):
    # Chain pipes: the right-hand pipe pulls from the left-hand one.
    other.source = self
    return other

def __ror__(self, other):
    # Wrap a plain iterable (but not str/bytes) in an iterator.
    self.source = (
        iter(other)
        if not isinstance(other, (str, bytes)) and hasattr(other, "__iter__")
        else other
    )
    return self
The tricky part was the implementation of __next__, since I wanted it to be a lazy operation. After a few trials and errors I ended up with a pretty simple
approach: the wrapping class implementing the pipe calls next() on its
source (attached by __or__ or __ror__), applies the transformation, and returns
the result of the transformation.
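For illustration, here is a minimal self-contained sketch of that approach; the class name P and the exact wiring are my reconstruction, not necessarily the final implementation:

class P:
    """A lazy pipe step: applies func to each item pulled from its source."""

    def __init__(self, func):
        self.func = func
        self.source = None

    def __or__(self, other):
        other.source = self
        return other

    def __ror__(self, other):
        self.source = (
            iter(other)
            if not isinstance(other, (str, bytes)) and hasattr(other, "__iter__")
            else other
        )
        return self

    def __iter__(self):
        return self

    def __next__(self):
        # Pull one item from the source, transform it, and return it;
        # StopIteration from an exhausted source simply propagates.
        return self.func(next(self.source))


# Nothing is computed until the pipeline is consumed.
doubled = range(10) | P(lambda x: x * 2) | P(lambda x: x + 1)
print(list(doubled))  # [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]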
A simple visualization of the difference between frequentist and Bayesian power,
and of how effect size, noise (standard deviation), and sample size affect the results.
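As a rough sketch of the frequentist side of that visualization (the function name, defaults, and the two-sample t-test setup are my assumptions): power can be estimated as the fraction of simulated experiments where the test rejects the null, which shows directly how effect size, noise, and sample size move it.

import numpy as np
from scipy import stats


def simulated_power(effect_size, sd, n, alpha=0.05, n_sims=5000, seed=42):
    """Estimate frequentist power by simulating two-sample experiments."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        control = rng.normal(0.0, sd, n)
        treatment = rng.normal(effect_size, sd, n)
        _, p_value = stats.ttest_ind(control, treatment)
        rejections += p_value < alpha
    return rejections / n_sims


# Bigger effect or sample size raises power; more noise lowers it.
print(simulated_power(effect_size=0.5, sd=1.0, n=50))
print(simulated_power(effect_size=0.5, sd=2.0, n=50))
print(simulated_power(effect_size=0.5, sd=1.0, n=200))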
I like the idea I got from one of Joscha Bach’s posts: everything we hear is automatically executed by our brain as if it were our own thoughts, even when we don’t pay attention to it. That is a security vulnerability, because it opens a gate for all sorts of exploitation. I keep returning to this thought frequently, so I decided to put it on “paper”.
The summary of my thoughts on the topic: we have a loophole that can be exploited in multiple ways, from petty verbal abuse or manipulation to full-fledged propaganda attacks. The most devious part is that it can crawl under your skin even if you don’t pay attention to it. It is probably even more contagious when you are not paying attention, since your guard is down. You can resist certain ideas when you hear them once, twice, ten times, but after a hundred repetitions they will eventually get through, unless there is something in you that makes those ideas completely unacceptable to you.
I was reading the original paper on “Hierarchical Navigable Small Worlds (HNSW)”, https://arxiv.org/abs/1603.09320, which I found much easier to understand than all the YouTube videos I tried to watch and the articles I tried to read. HNSW is a probabilistic data structure for searching for neighbors in multi-dimensional space.
One of its practical applications is the search for semantically close objects. Reading that paper, along with some other activities, made me curious whether I could quickly implement a recommendation system that combines three things: HNSW, moving averages, and randomness.
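To make the combination concrete, here is a rough sketch of the idea using the hnswlib package; the embeddings, the index parameters, and the way randomness is injected are all illustrative assumptions:

import hnswlib
import numpy as np

rng = np.random.default_rng(0)
dim, n_items = 64, 10_000

# Stand-in item embeddings; in practice they come from some model.
item_vectors = rng.normal(size=(n_items, dim)).astype(np.float32)

index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=n_items, ef_construction=200, M=16)
index.add_items(item_vectors, np.arange(n_items))
index.set_ef(50)


def recommend(user_vector, k=10, explore_p=0.1):
    """Nearest items to the user's (moving-average) embedding,
    occasionally swapping in a random item for exploration."""
    labels, _ = index.knn_query(user_vector, k=k)
    picks = list(labels[0])
    if rng.random() < explore_p:
        picks[-1] = int(rng.integers(n_items))
    return picks


user_vector = item_vectors[:20].mean(axis=0)  # moving average stand-in
print(recommend(user_vector))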
I was curious about using averaged vector embeddings for recommendation purposes, and then
I started wondering whether, instead of averaging, I should try other statistics like the median or top percentiles
to focus on more frequent scenarios and reduce the influence of outliers.
And then the question was: imagine that you want to use it in production, how can you compute
averaged embeddings for millions of users, ideally with instant updates and without offline bulk data processing?
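One way I can imagine doing this without bulk jobs is to keep, per user, only a running count and mean and update them in place on every interaction event; here is a sketch with an in-memory dict standing in for whatever key-value store production would use:

import numpy as np

# user_id -> (count, mean_embedding); in production this would live in a
# key-value store and be updated by the event stream.
user_state: dict[str, tuple[int, np.ndarray]] = {}


def update_user_embedding(user_id: str, item_vector: np.ndarray) -> np.ndarray:
    """Incremental mean update: new_mean = mean + (x - mean) / n."""
    count, mean = user_state.get(user_id, (0, np.zeros_like(item_vector)))
    count += 1
    mean = mean + (item_vector - mean) / count
    user_state[user_id] = (count, mean)
    return mean


# An exponential moving average (mean = alpha * x + (1 - alpha) * mean)
# is a drop-in alternative that favors recent interactions.

Medians and percentiles are harder to maintain this way; as far as I know they usually require streaming quantile sketches such as t-digest.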
A few days ago a manager from a sister team asked me why I have “Chaotic good” in the “What I do” field of my work profile. The question now occupies my mind, so I need to drain it into something.
In short, it is a silly meme from a few years ago that I decided to use as a work motto instead of “stupidity and courage”, as I felt that the new one better reflected the type of work I was doing at the time.
I used to work closely with incredibly smart people who dealt with things like data sharding on a daily basis, and I learned a lot about that topic from them. Later I moved to a different role where that knowledge was not needed, and it faded away over time. Here I’m trying to reclaim for myself that long-forgotten knowledge.
Sharding is the process of assigning an item to a shard, a smaller chunk of data out of a large database or other service. The general idea is that we can distribute data or a service across multiple locations
to handle larger volumes of data or more requests; with replication we can scale even further and make the system more resilient. But we need clear rules for how we assign partitions, aka shards, so
that we can route requests to the right location.
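As a minimal illustration of such a rule, a hash-based assignment sketch (the key format and shard count are arbitrary; note that plain modulo hashing reshuffles most keys when the shard count changes, which is why schemes like consistent hashing exist):

import hashlib


def shard_for(key: str, num_shards: int = 16) -> int:
    """Deterministically map a key to a shard id using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Any service that knows the key computes the same routing decision.
print(shard_for("user:12345"))
print(shard_for("user:67890"))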