Review: The Geek Way by Andrew McAfee

The book explores in depth the key traits of successful companies such as Amazon, Apple, and Microsoft. It distinguishes four main areas: ownership, openness, speed, and science. Of the four, openness is presented as the key to the others, since it naturally enables them.

For myself, I summarize the book as two aspects: environment and speed.

The company needs to focus on creating an environment where ideas can move freely and where basic needs like access to code (including changing the code), data, documentation, and support are easily available. In addition, negative feedback is not only welcomed but actively sought, so that issues are discovered as early as possible, before they grow into large problems. This requires an atmosphere where people are not punished for mistakes, so there is no implicit incentive for cover-ups.

Read More…

Bright Cocaine: colors and dopamine

Intro

As a continuation of my attempts to regain control over my attention, two months ago I first switched my phone to black-and-white mode, then applied the same filter at 30% on my laptops, and then reduced the filter intensity to 20-30% on all my devices. It helped a bit. Actually, black and white helps quite significantly, but it is a bit tiring to use.

Read More…

Python Pipes

I’ve always wanted a way to build data processing pipelines in Python using pipes, like this: range(10) | F(is_odd) | P(lambda x: x * 2), instead of functions, generators, maps, and loops. So I’ve tried …

The idea is pretty simple: let’s create a class that implements the OR and ROR operators (__or__ and __ror__), which act as the pipes.

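    # `self | other`: both sides are pipe stages, so chain them.
    def __or__(self, other):
        other.source = self
        return other

    # `other | self`: the left side is a plain value such as range(10).
    def __ror__(self, other):
        # Iterables (except str/bytes) become iterators; anything else is kept as-is.
        self.source = (
            iter(other)
            if not isinstance(other, (str, bytes)) and hasattr(other, "__iter__")
            else other
        )
        return self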

The tricky part was the implementation of __next__, since I wanted it to be a lazy operation. After some trial and error I ended up with a pretty simple approach: the wrapping class that implements the pipe calls next on its source (set by OR or ROR), applies a transformation, and then returns the result of that transformation.
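Here is a minimal sketch of how that lazy __next__ could look. It is not the code from the full post: the base class name Pipe is hypothetical, and it assumes that P maps a function over items and F filters them, which the one-liner above suggests but does not confirm.

```python
from itertools import count


class Pipe:
    """Hypothetical base class: a stage that pulls items lazily from `source`."""

    def __init__(self, fn):
        self.fn = fn
        self.source = None

    def __or__(self, other):
        # stage | stage: the right-hand stage reads from this one.
        other.source = self
        return other

    def __ror__(self, other):
        # plain_value | stage: wrap iterables (except str/bytes) in an iterator.
        self.source = (
            iter(other)
            if not isinstance(other, (str, bytes)) and hasattr(other, "__iter__")
            else other
        )
        return self

    def __iter__(self):
        return self


class P(Pipe):
    """Assumed meaning: apply `fn` to each item."""

    def __next__(self):
        # Pull exactly one item from the source, transform it, return it.
        return self.fn(next(self.source))


class F(Pipe):
    """Assumed meaning: keep only items for which `fn` is truthy."""

    def __next__(self):
        # Keep pulling until an item passes; StopIteration from the
        # source propagates and ends the pipeline.
        while True:
            item = next(self.source)
            if self.fn(item):
                return item


def is_odd(x):
    return x % 2 == 1


result = list(range(10) | F(is_odd) | P(lambda x: x * 2))

# Laziness check: an infinite source works because items are pulled on demand.
first = next(count() | F(is_odd) | P(lambda x: x * 2))
```

Because each stage only calls next on its source when asked for a value, nothing is computed until the pipeline is consumed, here by list() or next().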

Read More…

All Things You Hear

I like an idea I got from one of Joscha Bach’s posts: everything we hear is automatically executed by our brain as if it were our own thoughts, even when we don’t pay attention to it. This is a security vulnerability that opens the gate to all sorts of exploits. I keep returning to this thought, so I decided to put it on “paper”.

Here is the summary of my thoughts on the topic. We have a loophole that can be exploited in multiple ways, from minor verbal abuse or manipulation to full-fledged propaganda attacks. The most devious part is that it can crawl under your skin even if you don’t pay attention to it. It is probably even more contagious when you are not paying attention, since your guard is down. You can resist certain ideas when you hear them once, twice, ten times, but after a hundred repetitions they will eventually get through, unless there is something in you that makes those ideas completely unacceptable to you.

Read More…

Recommendation System on HNSW and Exponential Moving Averages

Intro

I was reading the original paper on “Hierarchical Navigable Small Worlds (HNSW)” https://arxiv.org/abs/1603.09320, which I found much easier to understand than all the YouTube videos I tried to watch and the articles I tried to read. HNSW is a probabilistic data structure for approximate nearest-neighbor search in multi-dimensional space. One practical application is searching for semantically close objects. Reading that paper, along with some other activities, made me curious whether I could quickly implement a recommendation system that combines three things: HNSW, moving averages, and randomness.

Read More…

Moving Averages

I was curious about the use of averaged vector embeddings for recommendation purposes, and then I started wondering whether, instead of averaging, I should try other statistics like the median or top percentiles, to focus on more frequent scenarios and reduce the influence of outliers.

And then the question was: imagine that you want to use it in production. How can you compute an averaged embedding for millions of users, ideally with instant updates and without bulk offline data processing?
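One possible direction, sketched here as an assumption rather than as the answer the full post arrives at, is to keep a small running state per user and update it on every new event. The function names and the alpha parameter below are hypothetical; embeddings are plain lists of floats to keep the example self-contained.

```python
def update_running_mean(mean, count, x):
    """Incremental arithmetic mean: fold one new vector `x` into `mean`.

    Only the current mean and the number of vectors seen are stored,
    so no history needs to be reprocessed.
    """
    count += 1
    new_mean = [m + (xi - m) / count for m, xi in zip(mean, x)]
    return new_mean, count


def update_ema(ema, x, alpha=0.1):
    """Exponential moving average: recent vectors weigh more than old ones.

    `alpha` (assumed value) controls how quickly old events fade.
    """
    return [(1 - alpha) * e + alpha * xi for e, xi in zip(ema, x)]


# Usage: feed events one at a time, as they arrive.
mean, n = [0.0, 0.0], 0
for event in ([1.0, 2.0], [3.0, 4.0], [5.0, 6.0]):
    mean, n = update_running_mean(mean, n, event)

ema = update_ema([0.0, 0.0], [1.0, 2.0], alpha=0.5)
```

Both updates touch a single user's state and cost time proportional to the embedding dimension, which is what makes instant updates plausible. Medians and percentiles do not have an equally simple exact update.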

Read More…

Chaotic Good

A few days ago a manager from a sister team asked me why I have “Chaotic good” in the “What I do” field of my work profile. The question now occupies my mind, so I need to drain it into something.

In short, it is a silly meme from a few years ago that I decided to use as a work motto instead of “stupidity and courage”, as I felt that the new one better reflected the type of work I was doing at the time.

Read More…

Classic Salad Dressing

Ingredients

I have no clue how much of each ingredient you will need; all measurements are approximate.

  • Balsamic vinegar, or any acidic substance. Try lime juice; it adds a nice touch of aroma. 1 or 2 tbsp. Probably.
  • Olive oil, or any other oil; try peanut oil. 3 or 6 tbsp. Who knows how much.
  • Mustard. It stabilizes the emulsion. A bit of it.
  • Salt. A pinch?
  • Pepper. A bit.

The process

Mix everything together and stir for a minute until the emulsion forms.

Read More…

Data Sharding (Partitioning) Algorithms

I used to work closely with incredibly smart people who dealt with things like data sharding on a daily basis, and I learned a lot from them on that topic. Later I moved to a different role where that knowledge was not needed, and it faded away over time. Here I’m trying to reclaim that long-forgotten knowledge.

Intro

Sharding is the process of assigning an item to a shard, a smaller chunk of data from a large database or other service. The general idea is that we can distribute data or a service across multiple locations to handle larger volumes of data or more requests. With replication we can scale even further and make the system more resilient. But we need clear rules for how we assign partitions (aka shards) so that we can route requests to the right location.
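As an illustration of such a rule, here is a minimal sketch of hash-based assignment. It is only one of several possible schemes and not necessarily one the full post covers; the function name shard_for and the example key are hypothetical.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index in [0, num_shards).

    A cryptographic hash is used instead of Python's built-in hash()
    because hash() of a str is randomized per process, and routing
    must give the same answer on every machine.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and reduce modulo shard count.
    return int.from_bytes(digest[:8], "big") % num_shards


# Usage: every router that sees this key computes the same shard.
shard = shard_for("user-42", 4)
```

The rule is stateless and cheap, but changing num_shards remaps most keys, which is the main reason other schemes exist.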

Read More…