Others
Streaming and Sketching Algorithms
Sketching (Compressed data structures)
Given data and a function f, compress the data into a sketch that allows computing f from the sketch alone
Streaming
Maintain sketch on the fly, as the data is updated
https://www.sketchingbigdata.org
https://www.youtube.com/watch?v=xbTM3t26xLk
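As a toy illustration of the sketching idea (not a real compressed data structure), a running (count, sum) pair is a sketch from which the mean of a stream can be computed without storing the stream itself; the class name below is made up for the example.
class MeanSketch:
    def __init__(self):
        self.count = 0     # number of items seen
        self.total = 0.0   # running sum of items

    def update(self, x):
        # streaming update: process one item at a time, never look back
        self.count += 1
        self.total += x

    def query(self):
        # the function of interest (the mean) is computed from the sketch alone
        return self.total / self.count if self.count else 0.0

sketch = MeanSketch()
for x in [3, 1, 4, 1, 5]:
    sketch.update(x)
print(sketch.query())  # 2.8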
Streaming Algorithms
Reservoir Sampling
Reservoir sampling is a family of randomized algorithms for choosing a simple random sample without replacement of k items from a population of unknown size n in a single pass over the items. The size of the population n is not known to the algorithm and is typically too large to fit all n items into main memory. The population is revealed to the algorithm over time, and the algorithm cannot look back at previous items. At any point, the current state of the algorithm must permit extraction of a simple random sample without replacement of size k over the part of the population seen so far.
On the ith iteration of our loop to pick a random element, let's assume we already picked an element uniformly from [0, i - 1]. In order to maintain the loop invariant, we need to pick the ith element as the new random element with probability 1 / (i + 1). For the base case where i = 0, let's say the random element is the first one. Then we know it works because:
- For i = 0, we would've picked uniformly from [0, 0].
- For i > 0, before the iteration, any element K in [0, i - 1] had a 1 / i chance of being chosen as the random element. We want K to have a 1 / (i + 1) chance of being chosen after the iteration. This is the case, since the chance of K having been chosen already but not getting swapped with the ith element is (1 / i) * (1 - 1 / (i + 1)) = (1 / i) * (i / (i + 1)) = 1 / (i + 1).
import random

def pick(big_stream):
    random_element = None
    for i, e in enumerate(big_stream):
        if i == 0:
            # base case: the first element is the current pick
            random_element = e
        elif random.randint(1, i + 1) == 1:
            # replace the current pick with probability 1 / (i + 1)
            random_element = e
    return random_element
https://en.wikipedia.org/wiki/Reservoir_sampling
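The function above keeps a single random element (k = 1). Below is a minimal sketch of the general size-k reservoir (Algorithm R from the article above), under the same streaming assumptions; the function name is illustrative.
import random

def reservoir_sample(stream, k):
    reservoir = []
    for i, e in enumerate(stream):
        if i < k:
            # fill the reservoir with the first k items
            reservoir.append(e)
        else:
            # item i survives with probability k / (i + 1)
            j = random.randint(0, i)
            if j < k:
                reservoir[j] = e
    return reservoir

print(reservoir_sample(range(1000), 5))  # 5 items chosen uniformly without replacement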
Evolutionary Algorithm
https://en.wikipedia.org/wiki/Evolutionary_algorithm
- Artificial development
- Cellular evolutionary algorithm
- Cultural algorithm
- Differential evolution
- Effective fitness
- Evolution strategy
- Gaussian adaptation
- Evolutionary multimodal optimization
- Grammatical evolution
- Particle swarm optimization
In computational science, particle swarm optimization (PSO) is a computational method that optimizes a problem by iteratively trying to improve a candidate solution with regard to a given measure of quality. It solves a problem by having a population of candidate solutions, here dubbed particles, and moving these particles around in the search-space according to simple mathematical formulae over the particle's position and velocity. Each particle's movement is influenced by its local best known position, but is also guided toward the best known positions in the search-space, which are updated as better positions are found by other particles. This is expected to move the swarm toward the best solutions. A minimal sketch follows after this list.
https://en.wikipedia.org/wiki/Particle_swarm_optimization
- Memetic algorithm
- Natural evolution strategy
- Promoter based genetic algorithm
- Spiral optimization algorithm
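Below is a minimal PSO sketch that minimizes a 2-D sphere function, following the position/velocity update described under particle swarm optimization above; the inertia weight and acceleration coefficients are illustrative choices, not prescribed values.
import random

def pso(f, dim=2, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5):
    # random initial positions in [-5, 5]^dim, zero initial velocities
    pos = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # each particle's best known position
    pbest_val = [f(p) for p in pos]
    gbest = min(pbest, key=f)[:]                # best known position of the swarm
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # velocity: inertia + pull toward local best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest_val[i], pbest[i] = val, pos[i][:]
                if val < f(gbest):
                    gbest = pos[i][:]
    return gbest

print(pso(lambda p: sum(x * x for x in p)))  # should be close to [0, 0]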
Genetic Algorithm
- Chromosome
- Clonal selection algorithm
- Crossover
- Mutation
- Genetic memory
- Genetic fuzzy systems
- Selection
- Fly algorithm
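The terms above are the moving parts of a basic genetic algorithm. A minimal illustrative sketch follows (maximizing the number of 1-bits in a bit-string chromosome; the population size, tournament selection, single-point crossover, and mutation rate are illustrative choices).
import random

def genetic_algorithm(bits=20, pop_size=30, generations=50, mutation_rate=0.02):
    fitness = lambda chrom: sum(chrom)  # count of 1-bits in the chromosome
    pop = [[random.randint(0, 1) for _ in range(bits)] for _ in range(pop_size)]
    for _ in range(generations):
        new_pop = []
        for _ in range(pop_size):
            # selection: pick the fitter of a few random individuals (tournament)
            p1 = max(random.sample(pop, 3), key=fitness)
            p2 = max(random.sample(pop, 3), key=fitness)
            # crossover: splice the parents at a random cut point
            cut = random.randint(1, bits - 1)
            child = p1[:cut] + p2[cut:]
            # mutation: flip each bit with a small probability
            for i in range(bits):
                if random.random() < mutation_rate:
                    child[i] ^= 1
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)

print(genetic_algorithm())  # a bit string close to all ones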
Genetic Programming
- Cartesian genetic programming
- Linear genetic programming
- Multi expression programming
- Schema
- Eurisko
- Parity benchmark
Thompson's construction
It is an algorithm for transforming a regular expression into an equivalent nondeterministic finite automaton (NFA). This NFA can be used to match strings against the regular expression.
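Below is a minimal sketch of the construction for a tiny regex language (literal characters, concatenation '.', union '|', Kleene star '*'), taking the expression in postfix form so that no parser is needed; the state representation and function names are illustrative.
class State:
    def __init__(self):
        self.edges = {}    # character -> set of next states
        self.eps = set()   # epsilon (empty) transitions

def add_edge(s, label, t):
    if label is None:
        s.eps.add(t)
    else:
        s.edges.setdefault(label, set()).add(t)

def postfix_to_nfa(postfix):
    # each stack entry is an NFA fragment (start state, accept state)
    stack = []
    for c in postfix:
        if c == '.':                                  # concatenation
            s2, a2 = stack.pop(); s1, a1 = stack.pop()
            add_edge(a1, None, s2)
            stack.append((s1, a2))
        elif c == '|':                                # union
            s2, a2 = stack.pop(); s1, a1 = stack.pop()
            s, a = State(), State()
            add_edge(s, None, s1); add_edge(s, None, s2)
            add_edge(a1, None, a); add_edge(a2, None, a)
            stack.append((s, a))
        elif c == '*':                                # Kleene star
            s1, a1 = stack.pop()
            s, a = State(), State()
            add_edge(s, None, s1); add_edge(s, None, a)
            add_edge(a1, None, s1); add_edge(a1, None, a)
            stack.append((s, a))
        else:                                         # literal character
            s, a = State(), State()
            add_edge(s, c, a)
            stack.append((s, a))
    return stack.pop()

def eps_closure(states):
    stack, seen = list(states), set(states)
    while stack:
        st = stack.pop()
        for nxt in st.eps:
            if nxt not in seen:
                seen.add(nxt); stack.append(nxt)
    return seen

def matches(nfa, text):
    start, accept = nfa
    current = eps_closure({start})
    for ch in text:
        step = set()
        for st in current:
            step |= st.edges.get(ch, set())
        current = eps_closure(step)
    return accept in current

nfa = postfix_to_nfa('ab*.')        # 'ab*' written in postfix
print(matches(nfa, 'abbb'))         # True
print(matches(nfa, 'b'))            # False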
Approximate counting algorithm
The approximate counting algorithm allows the counting of a large number of events using a small amount of memory. Invented in 1977 by Robert Morris (cryptographer) of Bell Labs, it uses probabilistic techniques to increment the counter. It was fully analyzed in the early 1980s by Philippe Flajolet of INRIA Rocquencourt, who coined the name approximate counting, and strongly contributed to its recognition among the research community. The algorithm is considered one of the precursors of streaming algorithms, and the more general problem of determining the frequency moments of a data stream has been central to the field.
https://en.wikipedia.org/wiki/Approximate_counting_algorithm
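A minimal sketch of a Morris-style counter, assuming the basic base-2 variant: store only an exponent c, increment it with probability 2^-c, and estimate the true count as 2^c - 1.
import random

class MorrisCounter:
    def __init__(self):
        self.c = 0  # only the exponent is stored, not the count itself

    def increment(self):
        # increment the exponent with probability 2^-c
        if random.random() < 2 ** -self.c:
            self.c += 1

    def estimate(self):
        # 2^c - 1 is an unbiased (but noisy) estimate of the number of increments
        return 2 ** self.c - 1

counter = MorrisCounter()
for _ in range(100000):
    counter.increment()
print(counter.estimate())  # a rough estimate of 100000, stored in a handful of bits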