DME - Data Mining and Exploration (INFR 11007) Review

This is my review note for the DME course (Data Mining and Exploration (INFR11007), 2019) at the University of Edinburgh. The note covers every step in developing machine learning models and the related knowledge, e.g., Exploratory Data Analysis (EDA), Data Preprocessing, Modeling, and Model Evaluation. Remember to read the ‘Lab’ section of each chapter.


Data Analysis Process

1. Exploratory Data Analysis

1.1 Numerical Data Description

1.1.1 Location

  • Non-robust Measure

    • Sample Mean (arithmetic mean or average): $\hat{x} = \frac{1}{n}\sum_{i=1}^{n} x_{i}$
      • For a random variable $x$ with density $p(x)$: $\mathbb{E}[x] = \int xp(x) dx$
  • Robust Measure

    • Median:

      $$ \text{median}(x) = \begin{cases} x_{((n+1)\mathbin{/}2)}& \text{; if $n$ is odd}\\ \frac{1}{2}[x_{(n\mathbin{/}2)}+x_{(n\mathbin{/}2+1)}]& \text{; if $n$ is even} \end{cases} $$
      where $x_{(k)}$ denotes the $k$-th smallest value of the sample.
    • Mode: The value that occurs most frequently

    • $\alpha$-th Sample Quantile $q_{\alpha}$ (roughly the data point with a fraction $\alpha$ of the ordered data at or below it, i.e. $q_{\alpha} \approx x_{([n\alpha])}$)
      • $Q_{1} = q_{0.25}$, $Q_{2} = q_{0.5}$, $Q_{3} = q_{0.75}$
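
The location measures above can be sketched in a few lines of Python. This is only an illustration using the standard library: the sample values are made up, `rough_quantile` is a hypothetical helper, and reading $[n\alpha]$ as "round up" is an assumption (other quantile conventions exist). It also shows why the mean is called non-robust: one outlier moves it a lot, while the median barely changes.

```python
import math
import statistics


def rough_quantile(data, alpha):
    """Rough alpha-th sample quantile: the order statistic x_(k).

    Assumption: k = ceil(n * alpha), i.e. [n * alpha] is read as rounding up.
    """
    xs = sorted(data)
    k = max(1, math.ceil(len(xs) * alpha))  # 1-based index into the sorted data
    return xs[k - 1]


# Made-up sample, with and without a single outlier
data = [2, 3, 3, 4, 5, 6, 7]
with_outlier = data + [100]

# Non-robust: the mean is pulled strongly towards the outlier
mean_clean = statistics.mean(data)
mean_outlier = statistics.mean(with_outlier)

# Robust: the median moves only slightly
median_clean = statistics.median(data)
median_outlier = statistics.median(with_outlier)

# Mode: the most frequent value
mode_clean = statistics.mode(data)

# Quartiles Q1, Q2, Q3 via the rough order-statistic rule
q1 = rough_quantile(data, 0.25)
q2 = rough_quantile(data, 0.50)
q3 = rough_quantile(data, 0.75)

print(mean_clean, mean_outlier)
print(median_clean, median_outlier)
print(mode_clean, (q1, q2, q3))
```

With this rounding convention, $Q_{2}$ coincides with the median whenever $n$ is odd; for even $n$ the two can differ slightly, which is why the quantile rule is only described as rough.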

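OpenCV - Python API Notes

Opencv-python is the Python API for OpenCV, which includes hundreds of computer vision algorithms. This page records some commonly used opencv-python functions to serve as my quick reference.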

1. Installation and Usage

  • Installation
pip install opencv-python

There are actually four different packages; installing any one of them is enough. All four are imported under the same name, cv2 (for the other packages, see the Documentation).

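2. GUI Features in OpenCV

2.1 Basic Image Operations (Read, Display, Save)

The three functions cv.imread(), cv.imshow(), and cv.imwrite() are used to read, display, and save images, respectively.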


Numpy&Pandas Tutorial

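Numpy and Pandas are important for data processing in Python. For data analysis and mining in particular, Pandas is almost indispensable. I started this tutorial because I was asked in an interview which numpy function removes duplicates and realized how unfamiliar I was with numpy, so I hope writing it will help the material stick… (haven’t started yet)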

