Archives: AI/Machine Learning

google SRE book notes

Google SRE book notes: https://sre.google/sre-book/table-of-contents/ Risk: measure aggregate availability = successful requests / total requests (instead of uptime/downtime). Release branching: all code is checked into the main branch of the source code tree (mainline). However, most major projects don’t release directly from the mainline. Instead, we branch from the mainline at a specific revision and never • Read More »
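
The request-based availability measure in the notes above can be sketched in a few lines of Python. This is only an illustration: the function name and the input validation are assumptions, not something taken from the SRE book or the post.

```python
def aggregate_availability(successful_requests: int, total_requests: int) -> float:
    """Return availability as successful requests divided by total requests."""
    if total_requests <= 0:
        # With no traffic the ratio is undefined, so refuse to guess a value.
        raise ValueError("total_requests must be positive")
    if not 0 <= successful_requests <= total_requests:
        raise ValueError("successful_requests must be between 0 and total_requests")
    return successful_requests / total_requests


if __name__ == "__main__":
    # Hypothetical numbers: 999 good responses out of 1000 requests.
    print(aggregate_availability(999, 1000))
```

Unlike an uptime/downtime ratio, this value weights every request equally, so an outage during a quiet hour costs less availability than the same outage at peak traffic.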


kafka msg format, how to publish, read

How to publish a Kafka message. From a programmer's point of view, a Kafka message is just a topic, key, value, and headers. https://kafka-python.readthedocs.io/en/master/apidoc/KafkaProducer.html send(topic, value=None, key=None, headers=None, partition=None, timestamp_ms=None): Publish a message to a topic. Parameters: topic (str) – topic where the message will be published; value (optional) – message value. Must be type bytes, or be serializable to bytes via configured value_serializer. If value • Read More »
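
A minimal sketch of the topic/key/value/headers idea above. The `publish` helper and the `RecordingProducer` stand-in are hypothetical names introduced here so the example runs without a broker; only the `send(topic, value=..., key=..., headers=...)` call mirrors the kafka-python signature quoted in the excerpt. The topic name and broker address in the comments are likewise assumptions.

```python
from typing import Any, Dict, List, Optional, Tuple


def to_bytes(text: Optional[str]) -> Optional[bytes]:
    """Encode a string as UTF-8 bytes, passing None through unchanged."""
    return None if text is None else text.encode("utf-8")


def publish(producer: Any, topic: str, value: str,
            key: Optional[str] = None,
            headers: Optional[Dict[str, str]] = None) -> Any:
    """Send one message; kafka-python expects headers as (str, bytes) tuples."""
    encoded_headers: Optional[List[Tuple[str, bytes]]] = None
    if headers:
        encoded_headers = [(name, text.encode("utf-8")) for name, text in headers.items()]
    return producer.send(topic, value=to_bytes(value), key=to_bytes(key),
                         headers=encoded_headers)


class RecordingProducer:
    """Hypothetical stand-in for kafka.KafkaProducer that only records calls."""

    def __init__(self) -> None:
        self.sent: List[tuple] = []

    def send(self, topic, value=None, key=None, headers=None,
             partition=None, timestamp_ms=None):
        self.sent.append((topic, value, key, headers))


if __name__ == "__main__":
    # With a real broker one would instead do (address is an assumption):
    #   from kafka import KafkaProducer
    #   producer = KafkaProducer(bootstrap_servers="localhost:9092")
    producer = RecordingProducer()
    publish(producer, "orders", "hello", key="k1", headers={"source": "demo"})
    print(producer.sent)
```

Encoding to bytes up front keeps the sketch independent of any `value_serializer` or `key_serializer` configured on the producer.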


ML workflow and pipeline orchestration

Kale – Aims at simplifying the Data Science experience of deploying Kubeflow Pipelines workflows.
Flyte – Easy to create concurrent, scalable, and maintainable workflows for machine learning.
MLRun – Generic mechanism for data scientists to build, run, and monitor ML tasks and pipelines.
Prefect – A workflow management system, designed for modern infrastructure.
ZenML – An extensible open-source MLOps framework • Read More »


responsible AI

General intro:
https://github.com/alexandrainst/responsible-ai
https://ai.google/responsibilities/responsible-ai-practices/
https://www.tensorflow.org/responsible_ai
Open source implementation:
https://github.com/microsoft/responsible-ai-toolbox
https://www.tensorflow.org/responsible_ai/api_docs
https://opendatascience.com/15-open-source-responsible-ai-toolkits-and-projects-to-use-today/
Responsible AI Toolkits for AI Ethics & Privacy:
TensorFlow Privacy: TensorFlow Privacy is a Python library that includes implementations of TensorFlow optimizers for training machine learning models with differential privacy.
TensorFlow Federated: TFF has been developed to facilitate open research and experimentation with Federated Learning • Read More »


Bias vs Variance in ML

Somehow even the Wikipedia article https://en.wikipedia.org/wiki/Bias%E2%80%93variance_tradeoff does not explain the tradeoff clearly, because of its unclear mathematical notation. The video gives a more precise meaning of bias and variance. Expected error of the algorithm: the goal of the algorithm is to reduce the total error when we do prediction (generalization). Thus we need to calculate the expected error of the • Read More »
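
As a rough illustration of the expected-error idea above, the sketch below treats a list of predictions at one input point as coming from models trained on different datasets, and splits their mean squared error into squared bias plus variance. It assumes a noise-free target (so there is no irreducible-error term), and the function name and sample numbers are made up for the example; the post's own notation is not visible in this excerpt.

```python
from typing import Sequence, Tuple


def bias_variance(predictions: Sequence[float],
                  true_value: float) -> Tuple[float, float, float]:
    """Return (squared bias, variance, expected squared error) at one point."""
    n = len(predictions)
    mean_prediction = sum(predictions) / n
    # Squared bias: how far the average prediction is from the truth.
    bias_squared = (mean_prediction - true_value) ** 2
    # Variance: how much individual predictions scatter around their average.
    variance = sum((p - mean_prediction) ** 2 for p in predictions) / n
    # Expected squared error of a prediction against the true value.
    expected_error = sum((p - true_value) ** 2 for p in predictions) / n
    return bias_squared, variance, expected_error


if __name__ == "__main__":
    # Hypothetical predictions from three models; the true value is 3.0.
    b2, var, err = bias_variance([2.0, 4.0, 6.0], 3.0)
    print(b2, var, err)
    # The decomposition err = b2 + var holds up to floating-point rounding.
    print(abs(b2 + var - err) < 1e-12)
```

The identity checked at the end is what makes the tradeoff concrete: for a fixed expected error, lowering one term leaves more room for the other.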



c++ multi-set/multi-map, set/map and its unordered version

Columns: container name, implementation/underlying struct, notes/sample.
unordered_set: the value of an element is at the same time its key, which identifies it uniquely. Implementation: hash table.
unordered_multiset: much like unordered_set containers, but allowing different elements to have equivalent values. Implementation: hash table. Notes: internally, when an existing value is inserted, the data structure increases its count which is associated with each • Read More »


webassembly and web audio worklet

Some notes. WebAssembly: a cool technology that lets you compile C/C++/Rust and other languages into wasm and expose an API to the JavaScript world. Web Audio worklet/worker: lets developers intercept the audio stream for custom processing. A good introduction is at: https://developers.google.com/web/updates/2017/12/audio-worklet Advanced pattern: https://developers.google.com/web/updates/2018/06/audio-worklet-design-pattern https://github.com/GoogleChromeLabs/web-audio-samples/tree/master/audio-worklet/design-pattern/shared-buffer/ The combination of wasm and audio worklet can do a fair amount • Read More »


How to learn pytorch from scratch

PyTorch is a powerful, flexible, and popular deep learning framework. The learning curve can be steep if you do not have much deep learning background. So what is a good way to learn it? The official PyTorch website has some great tutorials at: https://pytorch.org/tutorials/ IMHO those materials are tuned to some intermediate level • Read More »


how to prevent nvidia GPU hangup on ( debian 10)

Debian 10 has an NVIDIA GPU driver, CUDA 9, etc., which is a good thing for deep learning (Keras or PyTorch). But if you leave it running for some time (without any interruption), it hangs after power management kicks in. The workaround is to turn off power management by running: sudo systemctl mask • Read More »