《算法小红书》读书笔记
Contents
[TOC]
Chapter0 前言
- 配套网站
- Coursera 在线课程: Part 1 (data, sorting, searching) Part 2
topic | data structures and algorithms |
---|---|
data | types stack, queue, bag, union-find, priority queue |
sorting | quicksort, mergesort, heapsort |
searching | BST, red-black BST, hash table |
graphs | BFS, DFS, Prim, Kruskal, Dijkstra |
strings | radix sorts, tries, KMP, regexps, data compression |
advanced | B-tree, suffix array, maxflow |
第1章 基础
1. 基础编程模型
-
表达式, && 运算符高于 ||
-
Java 中创建数组
-
声明数组的名字和类型
-
创建数组
-
初始化数组元素
// 完整模式 double[] a; // 声明数组 a = new double[N]; // 创建数组 for (int i = 0; i < N; i++) // 初始化数组 a[i] = 0.0; // 简化写法, 默认值就是 0.0 double[] a = new double[N]; // 声明初始化 int[] a = {1, 1, 2, 3, 5, 8}
-
-
自动转换, 例如 Java 在连接字符串(使用 + 号)时会自动将任意数据类型的值转换为字符串, 即有一个参数为字符串, 结果也是一个字符串
-
println() 会附件一个换行符, printf() 能够格式化输出
-
Java 字节码
- 运行于 Java 虚拟机上的代码, 保证 Java 运行在各种设备上
-
Java 中 1/0 会出现异常, 1.0/0.0 的值是 Infinity
-
for 和 while 的区别: for 循环结束其递增变量不可用, 而 while 循环可用
-
int a[] 和 int[] a, 两者是等价的, 前者是 C 语言中的方式, 后者是 Java 提倡的
-
索引是 0 开头的原因, 来源于机器语言要计算数组元素的地址, 需要起始地址加上索引, 如果是 1, 会浪费 第一个元素的空间
-
Java 中, 一个静态方法不能将另一个静态方法作为参数
2. 数据抽象
3. 背包, 队列和栈
- 数据类型: Bag, Queue, Stack
- 泛型, 迭代
- 链式数据结构的重要性, 链表
- 泛型, 参数化类型
- 背包, 一种不支持从中删除元素的集合数据类型
4. 算法分析
-
科学方法
- 观察特点, 精确测量
- 假设模型
- 预测
- 根据结果与预测, 改进模型
- 直到预测与观察一致
-
原则
- 实验可复现(reproducible)
- 可证伪(falsifiable)
-
order of growth
order of growth | name | typical code framework | description | example | T(2N) / T(N) |
---|---|---|---|---|---|
1 | constant | a = b + c; | statement | add two numbers | 1 |
log N | logarithmic | while (N > 1) { N = N / 2; … } | divide in half | binary search | ~1 |
N | linear | for (int i = 0; i < N; i++) {… } | loop | find the maximum | 2 |
N log N | linearithmic | [see mergesort lecture] | divide and conquer | mergesort | ~2 |
N2 | quadratic | for (int i = 0; i < N; i++) for (int j = 0; j < N; j++){… } | double loop | check all pairs | 4 |
N3 | cubic | for (int i = 0; i < N; i++) | for (int j = 0; j < N; j++)for (int k = 0; k < N; k++) {… } | triple loop | check all triples |
2N | exponential | [see combinatorial search lecture] | exhaustive search | check all subsets | T(N) |
5. 案例研究: union-find 算法
- 动态连通性
- union-find 算法的 API,
public class UF | ||
---|---|---|
UF(int N) | initialize union-find data structure with N objects (0 to N – 1)(以整数标识(0到 N-1)初始化 N 个触点) | |
void | union(int p, int q) | add connection between p and q (在 p 和 q 之间添加一条连接) |
boolean | connected(int p, int q) | are p and q in the same component? (如果 p 和 q 在同一分量中则返回 TRUE) |
int | find(int p) | component identifier for p (0 to N – 1) (p 所在的分量的标志符(0 到 N-1)) |
int | count() | number of components (连通分量的数量) |
- 如果两个触点在不同的分量中
- union(): 将两个分量归并
- find(): 返回给定触点所在的联通分量的标志符
- connected: 判断两个触点是否存在于同一分量之中
- count(): 返回所有连通分量的数量
- 开始有 N 个分量, union() 操作后分量数为 N-1
- 实现
-
quick-find 算法, 代码:
::JAVA public int find(int p) { return id[p]; } public void union(int p, int q) { // 将 p 和 q 归并到相同的分量中 int pID = find(p); int qID = find(q); // 如果 p 和 q 已经在相同的分量之中则不做改变 if (pID == qID) return; // 将 p 的分量重命名为 q 的 id for (int i = 0; i< id.length; i++) if (id[i] == pID) id[i] = qID; count--; }
-
quick-find 分析
- 使用此算法直到最后只剩一个连通分量, 那么最少需要(N + 3)(N - 1)次数组访问, 即 N^2 级别
- 调用 N-1 次 union(), 每次访问数组 N+3
-
quick-union 算法
::java public class QuickUnionUF { private int[] id; public QuickUnionUF(int N) { id = new int[N]; for (int i = 0; i < N; i++) id[i] = i; } private int root(int i){ while (i != id[i]) i = id[i]; return i; } public boolean connected(int p, int q) { return root(p) == root(q); } public void union(int p, int q) { int i = root(p); int j = root(q); id[i] = j; } }
-
加权 quick-union(Weighted quick-union)
-
特点:
- Modify quick-union to avoid tall trees.
- Keep track of size of each tree (number of objects).
- Balance by linking root of smaller tree to root of larger tree.
- make the root of the smaller tree (in terms of the number of nodes) point to the root of the larger tree.
-
code:
::java public class WeightedQuickUnionUF { private int[] id; private int[] sz; // 各个根节点对应分量的大小 private int count; // 连通分量的数量 public WeightedQuickUnionUF(int N) { count = N id = new int[N]; for (int i = 0; i < N; i++) id[i] = i; sz = new int[N]; for (int i = 0; i < N; i++) sz[i] = i; } private int root(int i){ // 跟随链接找到根节点 while (i != id[i]) i = id[i]; return i; } public int count() { return count; } public boolean connected(int p, int q) { return root(p) == root(q); } public void union(int p, int q) { int i = root(p); int j = root(q); id[i] = j; if (i==j) return; // 将小树的根节点连接到大树的根节点 if (sz[i] < sz[j]) { id[i] = j; sz[j] += sz[i];} else { id[j] = i; sz[i] += sz[j];} count--; } }
-
-
树的深度为什么是 lg 数量级? 设一棵完全二叉树节点个数为N,高度为h。所以总节点个数N满足以下不等式:
- 1 + 21 + 22 +……+ 2h-1 < N <= 1 + 21 + 22 +……+ 2h
- 即 2h - 1 < N <= 2h+1 - 1
- 所以 2h < N+1 <= 2h+1,
- h < log2(N+1) <= h+1 (两边同取以2为底的对数得)
- 算法上 lg 一般默认 以 2 为底的 log
-
比较
- Quick-find defect.
- Union too expensive (N array accesses).
- Trees are flat, but too expensive to keep them flat.
- Quick-union defect.
- Trees can get tall.
- Find too expensive (could be N array accesses).
algorithm initialize union find quick-find N N 1 quick-union N N† N weighted QU N lgN† lg N quick-union is worst case † includes cost of finding roots - Quick-find defect.
-
展望
-
coursera
-
Percolation
- 每一格开放的概率为 p, 阻塞概率为(1-p), (Each site is open with probability p )
- 渗透系统: 当顶层格子连通底层格子
- 设 常数 p*
- 当 p > p*, 认为渗透系统
- 当 p < p*, 认为系统不渗透
-
算法实现
- Q. How to check whether an N-by-N system percolates?
- Create an object for each site and name them 0 to N 2 – 1.
- Sites are in same component if connected by open sites.
- Percolates iff any site on bottom row is connected to site on top row.
- C・lever trick. Introduce 2 virtual sites (and connections to top and bottom).
- Percolates iff virtual top site is connected to virtual bottom site.
- Q. How to model opening a new site?
- A. Mark new site as open; connect it to all of its adjacent open sites.
- Q. What is percolation threshold p* ?
- A. About 0.592746(constant known only via simulation) for large square lattices.
- Q. How to check whether an N-by-N system percolates?
-
Steps to developing a usable algorithm.
- Model the problem.
- Find an algorithm to solve it. ・Fast enough? Fits in memory?
- If not, figure out why.
- Find a way to address the problem. ・Iterate until satisfied.
-
蒙特卡洛模拟(Monte Carlo simulation)
- 构造或描述概率过程
- 实现从已知概率分布抽样
- 建立各种估计量
-
利用蒙特卡洛模拟渗透系统
- Initialize all sites to be blocked.
- Repeat the following until the system percolates
- Choose a site uniformly at random among all blocked sites.
- Open the site.
- The fraction of sites that are opened when the system percolates provides an estimate of the percolation threshold.
第2章 排序
0. Coursera
-
Stacks and Queues
- Client: program using operations defined in interface.
- Implementation: actual code implementing operations.
- Interface: description of data type, basic opera
- resizing arrays
- 当字典剩余空间用完, 新建字典大小为原字典的两倍, 然后拷贝所有数据
- shrink, 临界点为原字典 1/4 时, 新建 1/2 的字典, 然后拷贝数据到新字典
- Invariant. Array is between 25% and 100% full
-
generics(泛型)
-
将类型参数化, 泛型就是我定义一个类(叫泛型类型),这个类有个类型参数:class Stack,
-
使用的时候(叫做泛型实例)指定一下类型就可以了:Stack s = … 泛型就是多个类型的共同抽象。
Stack s = new Stack(); Apple a = new Apple(); Orange b = new Orange(); s.push(a); s.push(b); // compile-time error a = s.pop();
-
java 中的 iterable: has method that returns an Iterator
-
Iterator: has methods
- hasNext()
- next()
-
-
Interview Questions: Stacks and Queues
-
Queue with two stacks. Implement a queue with two stacks so that each queue operations takes a constant amortized number of stack operations.
-
方法一:
-
enQueue(q, x)
- 当 stack1 非空, 把 stack1 全部数据 push 到 stack2
- Push x 到 stack1
- push 所有数据 从 stack2 到 stack1
-
dnQueue(q)
- 如果 stack1 为空, 报错
- stack1.pop()
-
-
方法二:
- enQueue(q, x)
- stack1.push(x)
- deQueue(q)
- 如果 stack1 和 stack2 都为空, 报错
- 如果 stack2 为空, stack1 非空, 把 stack1 的全部数据 push 到 stack2 中
- stack2.pop()
- enQueue(q, x)
-
方法三, 一个 stack,
-
enQueue(x)
- stack1.push(x)
-
deQueue:
- 如果 stack1 为空, 报错
- 如果 stack1 只有一个元素, 返回此元素
- 将 stack1 递归 pop, pop 的每个值储存到变量 res, 然后 stack1.push(res), 直到 stack1 只有一个元素, 返回 此元素
-
code:
::java // Java Program to implement a queue using one stack import java.util.Stack; public class QOneStack { //class of queue having two stacks static class Queue { Stack<Integer> stack1; } /* Function to push an item to stack*/ static void push(Stack<Integer> top_ref,int new_data) { /* put in the data */ top_ref.push(new_data); } /* Function to pop an item from stack*/ static int pop(Stack<Integer> top_ref) { /*If stack is empty then error */ if(top_ref == null) { System.out.println("Stack Underflow"); System.exit(0); } //return element from stack return top_ref.pop(); } /* Function to enqueue an item to queue */ static void enQueue(Queue q,int x) { push(q.stack1,x); } /* Function to dequeue an item from queue */ static int deQueue(Queue q) { int x,res=0; /* If the stacks is empty then error */ if(q.stack1.isEmpty()) { System.out.println("Q is Empty"); System.exit(0); } //Check if it is a last element of stack else if(q.stack1.size() == 1) { return pop(q.stack1); } else { /* pop an item from the stack1 */ x=pop(q.stack1); /* store the last dequeued item */ res = deQueue(q); /* push everything back to stack1 */ push(q.stack1,x); return res; } return 0; } /* Driver function to test above functions */ public static void main(String[] args) { /* Create a queue with items 1 2 3*/ Queue q = new Queue(); q.stack1 = new Stack<>(); enQueue(q, 1); enQueue(q, 2); enQueue(q, 3); /* Dequeue items */ System.out.print(deQueue(q) + " "); System.out.print(deQueue(q) + " "); System.out.print(deQueue(q) + " "); }
-
-
-
Stack with max. Create a data structure that efficiently supports the stack operations (push and pop) and also a return-the-maximum operation. Assume the elements are reals numbers so that you can compare them.
-
Java generics. Explain why Java prohibits generic array creation.
-
-
Elementary Sorts
1. 初级排序算法
2. 归并排序
3. 快速排序
4. 优先队列
5. 应用
第3章 查找
第4章 图
第5章 字符串
第6章 背景
Author yangly
LastMod 2018-05-20