[TOC]

Chapter0 前言

topic data structures and algorithms
data types stack, queue, bag, union-find, priority queue
sorting quicksort, mergesort, heapsort
searching BST, red-black BST, hash table
graphs BFS, DFS, Prim, Kruskal, Dijkstra
strings radix sorts, tries, KMP, regexps, data compression
advanced B-tree, suffix array, maxflow

第1章 基础

1. 基础编程模型

  • 表达式, && 运算符高于 ||

  • Java 中创建数组

    1. 声明数组的名字和类型

    2. 创建数组

    3. 初始化数组元素

       // 完整模式
       double[] a;               // 声明数组
       a = new double[N];        // 创建数组
       for (int i = 0; i < N; i++) // 初始化数组
           a[i] = 0.0;
          
       // 简化写法, 默认值就是 0.0
       double[] a = new double[N];
          
       // 声明初始化
       int[] a = {1, 1, 2, 3, 5, 8}
      
  • 自动转换, 例如 Java 在连接字符串(使用 + 号)时会自动将任意数据类型的值转换为字符串, 即有一个参数为字符串, 结果也是一个字符串

  • println() 会附件一个换行符, printf() 能够格式化输出

  • Java 字节码

    • 运行于 Java 虚拟机上的代码, 保证 Java 运行在各种设备上
  • Java 中 1/0 会出现异常, 1.0/0.0 的值是 Infinity

  • for 和 while 的区别: for 循环结束其递增变量不可用, 而 while 循环可用

  • int a[] 和 int[] a, 两者是等价的, 前者是 C 语言中的方式, 后者是 Java 提倡的

  • 索引是 0 开头的原因, 来源于机器语言要计算数组元素的地址, 需要起始地址加上索引, 如果是 1, 会浪费 第一个元素的空间

  • Java 中, 一个静态方法不能将另一个静态方法作为参数

2. 数据抽象

3. 背包, 队列和栈

  • 数据类型: Bag, Queue, Stack
  • 泛型, 迭代
  • 链式数据结构的重要性, 链表
  1. 泛型, 参数化类型
  2. 背包, 一种不支持从中删除元素的集合数据类型

4. 算法分析

  1. 科学方法

    • 观察特点, 精确测量
    • 假设模型
    • 预测
    • 根据结果与预测, 改进模型
    • 直到预测与观察一致
  2. 原则

    • 实验可复现(reproducible)
    • 可证伪(falsifiable)
  3. order of growth

order of growth name typical code framework description example T(2N) / T(N)
1 constant a = b + c; statement add two numbers 1
log N logarithmic while (N > 1) { N = N / 2; … } divide in half binary search ~1
N linear for (int i = 0; i < N; i++) {… } loop find the maximum 2
N log N linearithmic [see mergesort lecture] divide and conquer mergesort ~2
N2 quadratic for (int i = 0; i < N; i++) for (int j = 0; j < N; j++){… } double loop check all pairs 4
N3 cubic for (int i = 0; i < N; i++) for (int j = 0; j < N; j++)for (int k = 0; k < N; k++) {… } triple loop check all triples
2N exponential [see combinatorial search lecture] exhaustive search check all subsets T(N)

5. 案例研究: union-find 算法

  1. 动态连通性
  • union-find 算法的 API,
public class UF
UF(int N) initialize union-find data structure with N objects (0 to N – 1)(以整数标识(0到 N-1)初始化 N 个触点)
void union(int p, int q) add connection between p and q (在 p 和 q 之间添加一条连接)
boolean connected(int p, int q) are p and q in the same component? (如果 p 和 q 在同一分量中则返回 TRUE)
int find(int p) component identifier for p (0 to N – 1) (p 所在的分量的标志符(0 到 N-1))
int count() number of components (连通分量的数量)
  • 如果两个触点在不同的分量中
    • union(): 将两个分量归并
    • find(): 返回给定触点所在的联通分量的标志符
    • connected: 判断两个触点是否存在于同一分量之中
    • count(): 返回所有连通分量的数量
  • 开始有 N 个分量, union() 操作后分量数为 N-1
  1. 实现
  • quick-find 算法, 代码:

      ::JAVA
      public int find(int p)
      {   return id[p]; }
        
      public void union(int p, int q)
      {   // 将 p 和 q 归并到相同的分量中
          int pID = find(p);
          int qID = find(q);
        
          // 如果 p 和 q 已经在相同的分量之中则不做改变
          if (pID == qID) return;
        
          // 将 p 的分量重命名为 q 的 id  
          for (int i = 0; i< id.length; i++)
              if (id[i] == pID) id[i] = qID;
          count--;
      }
    
  • quick-find 分析

    • 使用此算法直到最后只剩一个连通分量, 那么最少需要(N + 3)(N - 1)次数组访问, 即 N^2 级别
    • 调用 N-1 次 union(), 每次访问数组 N+3
  • quick-union 算法

      ::java
      public class QuickUnionUF {
          private int[] id;
            
          public QuickUnionUF(int N) {
              id = new int[N];
              for (int i = 0; i < N; i++) id[i] = i;
          }
          private int root(int i){
              while (i != id[i]) i = id[i];
              return i;
          }
        
          public boolean connected(int p, int q) {
              return root(p) == root(q);
          }
            
          public void union(int p, int q) {
              int i = root(p);
              int j = root(q);
              id[i] = j;
          }
      }
    
  • 加权 quick-union(Weighted quick-union)

    • 特点:

      • Modify quick-union to avoid tall trees.
      • Keep track of size of each tree (number of objects).
      • Balance by linking root of smaller tree to root of larger tree.
      • make the root of the smaller tree (in terms of the number of nodes) point to the root of the larger tree.
    • code:

        ::java
        public class WeightedQuickUnionUF {
            private int[] id;
            private int[] sz; // 各个根节点对应分量的大小
            private int count; // 连通分量的数量
                  
            public  WeightedQuickUnionUF(int N) {
                count = N
                id = new int[N];
                for (int i = 0; i < N; i++) id[i] = i;
                sz = new int[N];
                for (int i = 0; i < N; i++) sz[i] = i;
            }
            private int root(int i){
                // 跟随链接找到根节点
                while (i != id[i]) i = id[i];
                return i;
            }
            public int count() {
                return count;
            }
            public boolean connected(int p, int q) {
                return root(p) == root(q);
            }                
            public void union(int p, int q) {
                int i = root(p);
                int j = root(q);
                id[i] = j;
                if (i==j) return;
                // 将小树的根节点连接到大树的根节点
                if (sz[i] < sz[j]) { id[i] = j; sz[j] += sz[i];}
                else               { id[j] = i; sz[i] += sz[j];}
                count--;
            }
        }
      
  • 树的深度为什么是 lg 数量级? 设一棵完全二叉树节点个数为N,高度为h。所以总节点个数N满足以下不等式:

    • 1 + 21 + 22 +……+ 2h-1 < N <= 1 + 21 + 22 +……+ 2h
    • 即 2h - 1 < N <= 2h+1 - 1
    • 所以 2h < N+1 <= 2h+1,
    • h < log2(N+1) <= h+1 (两边同取以2为底的对数得)
    • 算法上 lg 一般默认 以 2 为底的 log
  • 比较

    • Quick-find defect.
      • Union too expensive (N array accesses).
      • Trees are flat, but too expensive to keep them flat.
    • Quick-union defect.
      • Trees can get tall.
      • Find too expensive (could be N array accesses).
    algorithm initialize union find
    quick-find N N 1
    quick-union N N† N
    weighted QU N lgN† lg N
    quick-union is worst case
    † includes cost of finding roots
  1. 展望

  2. coursera

    • 测验解答博客Interview Questions: Union–Find (ungraded)

    • Percolation

      • 每一格开放的概率为 p, 阻塞概率为(1-p), (Each site is open with probability p )
      • 渗透系统: 当顶层格子连通底层格子
      • 设 常数 p*
        • 当 p > p*, 认为渗透系统
        • 当 p < p*, 认为系统不渗透
    • 算法实现

      • Q. How to check whether an N-by-N system percolates?
        • Create an object for each site and name them 0 to N 2 – 1.
        • Sites are in same component if connected by open sites.
        • Percolates iff any site on bottom row is connected to site on top row.
      • C・lever trick. Introduce 2 virtual sites (and connections to top and bottom).
        • Percolates iff virtual top site is connected to virtual bottom site.
      • Q. How to model opening a new site?
        • A. Mark new site as open; connect it to all of its adjacent open sites.
      • Q. What is percolation threshold p* ?
        • A. About 0.592746(constant known only via simulation) for large square lattices.
    • Steps to developing a usable algorithm.

      • Model the problem.
      • Find an algorithm to solve it. ・Fast enough? Fits in memory?
      • If not, figure out why.
      • Find a way to address the problem. ・Iterate until satisfied.
    • 蒙特卡洛模拟(Monte Carlo simulation)

      • 构造或描述概率过程
      • 实现从已知概率分布抽样
      • 建立各种估计量
    • 利用蒙特卡洛模拟渗透系统

      • Initialize all sites to be blocked.
      • Repeat the following until the system percolates
        • Choose a site uniformly at random among all blocked sites.
        • Open the site.
      • The fraction of sites that are opened when the system percolates provides an estimate of the percolation threshold.

第2章 排序

0. Coursera

  1. Stacks and Queues

    • Client: program using operations defined in interface.
    • Implementation: actual code implementing operations.
    • Interface: description of data type, basic opera
    • resizing arrays
      • 当字典剩余空间用完, 新建字典大小为原字典的两倍, 然后拷贝所有数据
      • shrink, 临界点为原字典 1/4 时, 新建 1/2 的字典, 然后拷贝数据到新字典
      • Invariant. Array is between 25% and 100% full
  2. generics(泛型)

    • 将类型参数化, 泛型就是我定义一个类(叫泛型类型),这个类有个类型参数:class Stack

    • 使用的时候(叫做泛型实例)指定一下类型就可以了:Stack s = … 泛型就是多个类型的共同抽象。

      Stack s = new Stack(); Apple a = new Apple(); Orange b = new Orange(); s.push(a); s.push(b); // compile-time error a = s.pop();

    • java 中的 iterable: has method that returns an Iterator

    • Iterator: has methods

      • hasNext()
      • next()
  3. Interview Questions: Stacks and Queues

    1. Queue with two stacks. Implement a queue with two stacks so that each queue operations takes a constant amortized number of stack operations.

      1. 方法一:

        • enQueue(q, x)

          1. 当 stack1 非空, 把 stack1 全部数据 push 到 stack2
          2. Push x 到 stack1
          3. push 所有数据 从 stack2 到 stack1
        • dnQueue(q)

          1. 如果 stack1 为空, 报错
          2. stack1.pop()
      2. 方法二:

        • enQueue(q, x)
          1. stack1.push(x)
        • deQueue(q)
          1. 如果 stack1 和 stack2 都为空, 报错
          2. 如果 stack2 为空, stack1 非空, 把 stack1 的全部数据 push 到 stack2 中
          3. stack2.pop()
      3. 方法三, 一个 stack,

        • enQueue(x)

          1. stack1.push(x)
        • deQueue:

          1. 如果 stack1 为空, 报错
          2. 如果 stack1 只有一个元素, 返回此元素
          3. 将 stack1 递归 pop, pop 的每个值储存到变量 res, 然后 stack1.push(res), 直到 stack1 只有一个元素, 返回 此元素
        • code:

            ::java           
            // Java Program to implement a queue using one stack
            import java.util.Stack;
                           
            public class QOneStack 
            {
                //class of queue having two stacks
                static class Queue
                {
                    Stack<Integer> stack1;
                }
                               
                /* Function to push an item to stack*/
                static void push(Stack<Integer> top_ref,int new_data)
                {
                    /* put in the data */
                    top_ref.push(new_data);
                }
                               
                /* Function to pop an item from stack*/
                static int pop(Stack<Integer> top_ref)
                {
                    /*If stack is empty then error */
                    if(top_ref == null)
                    {
                        System.out.println("Stack Underflow");
                        System.exit(0);    
                    }
                    //return element from stack
                    return top_ref.pop(); 
                }
                               
                /* Function to enqueue an item to queue */
                static void enQueue(Queue q,int x)
                {
                    push(q.stack1,x);
                }
                               
                /* Function to dequeue an item from queue */
                static int deQueue(Queue q)
                {
                    int x,res=0;
                    /* If the stacks is empty then error */
                    if(q.stack1.isEmpty())
                    {
                        System.out.println("Q is Empty");
                        System.exit(0);
                    }
                    //Check if it is a last element of stack
                    else if(q.stack1.size() == 1)
                    {
                        return pop(q.stack1);
                    }
                    else
                    {
                        /* pop an item from the stack1 */
                        x=pop(q.stack1);
                                       
                        /* store the last dequeued item */
                        res = deQueue(q);
                                       
                        /* push everything back to stack1 */
                        push(q.stack1,x);
                        return res;
                    }
                    return 0;
                }
                               
                /* Driver function to test above functions */
                public static void main(String[] args) 
                {
                    /* Create a queue with items 1 2 3*/
                    Queue q = new Queue();
                    q.stack1 = new Stack<>();
                                   
                    enQueue(q, 1);
                    enQueue(q, 2);
                    enQueue(q, 3);
                                   
                    /* Dequeue items */
                    System.out.print(deQueue(q) + " ");
                    System.out.print(deQueue(q) + " ");
                    System.out.print(deQueue(q) + " ");
                }
          
    2. Stack with max. Create a data structure that efficiently supports the stack operations (push and pop) and also a return-the-maximum operation. Assume the elements are reals numbers so that you can compare them.

    3. Java generics. Explain why Java prohibits generic array creation.

  4. Elementary Sorts

1. 初级排序算法

2. 归并排序

3. 快速排序

4. 优先队列

5. 应用

第3章 查找

第4章 图

第5章 字符串

第6章 背景