# machine_learning

**Repository Path**: lundechen/machine_learning

## Basic Information

- **Project Name**: machine_learning
- **Description**: Shanghai University Machine Learning (ML01) course
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 13
- **Forks**: 5
- **Created**: 2021-04-08
- **Last Updated**: 2025-08-06

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# ML01 Machine Learning, UTSEUS, Shanghai University

## QR Code

![](/img/qr.png)

## Language

English. For everything.

## Where and When

### Tencent Meeting

For each session, please always join Tencent Meeting (VooV Meeting):

- Room ID: 958 9491 5777

### Laptop

For each session, **please bring your own laptop!**

For Friday practice sessions, please bring your headphones as well, because you will watch videos.

### Tuesday (Lectures and Continuous assessments)

Mostly theory.

- 08:00 - 09:40
- D309

### Thursday (Exercise sessions)

Mostly code.

- 13:00 - 14:40
- D309

### Friday (Practice sessions)

Practice sessions help you become more industry-ready.

- 10:00 - 11:40
- D309

## Lectures (Tuesday)

### Week 1 - Machine Learning overview

### Week 2 - Linear Regression

You might find these visualization links useful:

- https://observablehq.com/@yizhe-ang/interactive-visualization-of-linear-regression
- https://visualize-it.github.io/polynomial_regression/simulation.html
- https://uclaacm.github.io/gradient-descent-visualiser/#playground
- https://ben-karr.github.io/react-3d-gradients/

(A minimal gradient-descent sketch for linear regression is given at the end of this Lectures section.)

### Week 3 - Logistic Regression (for classification)

You might find these visualization links useful:

- https://mlpocket.com/ml/supervised/logistic-regression
- https://mlu-explain.github.io/logistic-regression/
- https://playground.tensorflow.org/

### Week 4 - Neural networks

During the class, we will play a little bit with TensorFlow Playground:

- https://playground.tensorflow.org

These materials can be very interesting as well:

- https://developers.google.com/machine-learning/crash-course/neural-networks/nodes-hidden-layers
- https://developers.google.com/machine-learning/crash-course/neural-networks/activation-functions
- https://www.v7labs.com/blog/neural-networks-activation-functions

(A tiny forward-pass sketch is given at the end of this Lectures section.)

AFTER the class, please watch these videos very carefully.

Either on YouTube:

- https://www.youtube.com/watch?v=aircAruvnKk
- https://www.youtube.com/watch?v=IHZwWFHWa-w
- https://www.youtube.com/watch?v=Ilg3gGewQ5U
- https://www.youtube.com/watch?v=tIeHLnjs5U8

Or on Bilibili:

- https://www.bilibili.com/video/BV1bx411M7Zx/
- https://www.bilibili.com/video/BV1Ux411j7ri/
- https://www.bilibili.com/video/BV16x411V7Qg (two videos at this one URL)

Or in a playlist:

- https://space.bilibili.com/88461692/lists/1528929

### Week 5 - Building ML web apps

- https://docs.streamlit.io/get-started
- https://ollama.com/library/

(A minimal Streamlit app sketch is given at the end of this Lectures section.)

### Week 6 - Model selection

### Week 7 - CNN

- https://poloclub.github.io/cnn-explainer/
- https://github.com/helblazer811/ManimML

### Week 8 - GAN

- https://poloclub.github.io/ganlab/
- https://todayinsci.com/QuotationsCategories/I_Cat/Intuition-Quotations.htm

### Week 9 - AutoEncoder

- https://douglasduhaime.com/posts/visualizing-latent-spaces.html
- https://www.youtube.com/watch?v=RGBNdD3Wn-g
- https://dimensionality-reduction-293e465c2a3443e8941b016d.vercel.app/

### Week 10 - DQN - PPO - Transformer - Diffusion Model

- https://poloclub.github.io/transformer-explainer/
- https://poloclub.github.io/diffusion-explainer/
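As referenced in Week 2 above, here is a minimal sketch of fitting a line by gradient descent on the MSE loss. This is not the course's reference solution; the synthetic data and hyperparameters are made up for illustration.

```python
import numpy as np

# Minimal gradient-descent sketch for Week 2 (illustrative only).
# Fit y = w*x + b on synthetic data by minimizing the MSE loss.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 3.0 * x + 2.0 + rng.normal(scale=0.1, size=100)  # noisy line, true w=3, b=2

w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    y_hat = w * x + b
    grad_w = 2 * np.mean((y_hat - y) * x)  # d(MSE)/dw
    grad_b = 2 * np.mean(y_hat - y)        # d(MSE)/db
    w -= lr * grad_w
    b -= lr * grad_b

print(f"learned w = {w:.2f}, b = {b:.2f}")  # should end up close to 3 and 2
```

The Week 2 visualization links animate this same kind of update loop on the loss surface.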
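As referenced in Week 4 above, here is a tiny NumPy forward-pass sketch for a two-layer network. The layer sizes and the sigmoid/softmax choice are assumptions for illustration, not the starter code used in the tests.

```python
import numpy as np

# Tiny two-layer forward pass for Week 4 (illustrative only).
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))                    # 4 samples, 3 input features
W1, b1 = rng.normal(size=(3, 5)), np.zeros(5)  # hidden layer: 5 units
W2, b2 = rng.normal(size=(5, 2)), np.zeros(2)  # output layer: 2 classes

H = sigmoid(X @ W1 + b1)   # hidden activations
P = softmax(H @ W2 + b2)   # class probabilities
print(P.shape, P.sum(axis=1))  # (4, 2), each row sums to 1
```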
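Finally, as referenced in Week 5 above, a minimal Streamlit sketch of an interactive page. The file name `app.py` and the toy chart are assumptions; see the Streamlit docs linked in Week 5 for the actual getting-started guide.

```python
import numpy as np
import streamlit as st

# Minimal Streamlit page for Week 5 (illustrative only).
# Save as app.py and run with: streamlit run app.py
st.title("ML01 demo app")
w = st.slider("Slope w", min_value=-5.0, max_value=5.0, value=1.0)

x = np.linspace(-1, 1, 50)
st.line_chart({"y = w * x": w * x})  # the chart redraws whenever the slider moves
```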
## Continuous assessment (Tuesday)

Tests will take place on Tuesdays (Week 2, Week 4, Week 6, Week 8). Each test covers the topic of the previous week, with some extensions (e.g. some more math). You are recommended to read the materials provided by the prof ahead of time, to maximize your chance of success.

In total, 4 tests will be conducted. Tests are on paper, closed book: no Internet, no electronic devices, no discussion with classmates, no asking the prof questions.

After each test, feel free to forget everything that you have learned for test preparation, because your **intuition** has already been developed and will stay with you. After you have experienced all this, you will gain more confidence in yourself and be more open to new challenges. And that's the most important thing.

### Week 2 (Test 1/4)

Materials to read before the test:

- all materials for lectures and exercises
- https://www.t-ott.dev/2021/11/24/animating-normal-distributions
- https://demonstrations.wolfram.com/TheBivariateNormalDistribution/
- https://online.stat.psu.edu/stat505/lesson/4/4.2

Test (45 min):

- **Code**: Python list, Python string, Python dictionary
- **Code**: NumPy slicing, NumPy broadcasting
- **Math**: Gaussian distribution
- **Math**: bivariate Gaussian distribution
- **Misc**: GitHub Pull Request (GitHub workflow)

### Week 4 (Test 2/4)

Materials to read before the test:

- all materials for lectures and exercises
- https://docs.github.com/en/actions/quickstart
- https://resources.github.com/ci-cd/
- https://www.khanacademy.org/math/multivariable-calculus/applications-of-multivariable-derivatives/optimizing-multivariable-functions/a/what-is-gradient-descent

Test (45 min):

- **Code**: Linear Regression implementation from scratch
- **Code**: Logistic Regression implementation from scratch
- **Math**: gradient descent
- **Math**: MSE for linear regression
- **Math**: cross-entropy loss function for logistic regression
- **Misc**: ssh key, CI/CD, GitHub Actions, Python pickle, confusion matrix

### Week 6 (Test 3/4)

Materials to read before the test:

- all materials for lectures and exercises
- the four 3b1b videos [[1]](https://www.youtube.com/watch?v=aircAruvnKk) [[2]](https://www.youtube.com/watch?v=IHZwWFHWa-w) [[3]](https://www.youtube.com/watch?v=Ilg3gGewQ5U) [[4]](https://www.youtube.com/watch?v=tIeHLnjs5U8)

Test (45 min):

- **Code**: Neural network implementation from scratch (1/2)
  - Code scope limited to [code-nn-from-scratch-test.md](week-5/code-nn-from-scratch-test.md)
  - Some basic starter code (>= 70%) will be provided
  - Explanation of some of the important code, from [code-nn-from-scratch-test.md](week-5/code-nn-from-scratch-test.md)
- **Math**: softmax function
- **Math**: the chain rule of calculus for univariate and multivariate functions
- **Conceptual Understanding**: universal approximation theorem
- **Conceptual Understanding**: GD vs. SGD vs. Adam, batch size
- **Conceptual Understanding**: Test-Driven Programming with GitHub Actions/CI/CD

### Week 8 (Test 4/4)

Test (45 min):

- **Code**: Neural network implementation from scratch (2/2)
  - Code scope limited to [code-nn-from-scratch-test.md](week-5/code-nn-from-scratch-test.md)
  - Some basic starter code (<= 50%) will be provided
  - Explanation of some of the important code, from [code-nn-from-scratch-test.md](week-5/code-nn-from-scratch-test.md)
- **Math**: [Dropout, inverted vs. original implementation](week-6/QA-dropout-1-p-multipy-or-devide.md)
- **Conceptual Understanding**: overfitting, underfitting, bias, variance, bias-variance trade-off, L1/L2 regularization, early stopping, etc., basically [all images in the slides of week-6/lecture-model-selection.ipynb](week-6/lecture-model-selection.ipynb)

## Exercise sessions (Thursday)

Most exercises will correspond to lecture topics, with some extensions.

### Week 1

Starting from this session, we will use Jupyter Notebook. Please install Python and VS Code; ideally, you should also be able to use Google Colab and GitHub. Make sure you have a seamless Internet connection to these websites.

Exercise:

- Python
- Numpy
- Pandas

### Week 2

Make sure that you can run our Jupyter Notebooks in VS Code. Also, make sure you have access to GitHub, Google and YouTube.

Exercise:

- Linear Regression from scratch

### Week 3

Exercise:

- Logistic Regression from scratch

### Week 4

Exercise:

- Neural Network from scratch

### Week 5

Exercise:

- Play with these ML apps, and get some ideas for your project:
  - https://www.tensorflow.org/js/demos
  - https://streamlit.io/gallery
  - https://shiny.rstudio.com/gallery/
- PITCH!
  - Present your amazing idea (even if it can, and should, be refined later)
  - Get people to join your team
  - Kick off your project

## Project

### Kick-off

Projects kick off in Week 5, in the code session.

### Forming groups

Each group has 4 students. Forming groups:

- https://docs.qq.com/doc/DT2xqVHphanhGUWpR (log in with WeChat by scanning the QR code)

At most ONE group may have 3 or 5 students, provided that `N_Student % 4 != 0`.

As our final project will be on En-to-CN translation of srt files, each group of 4 students is expected to include ONE Chinese student. You can do without, provided you can find people to give you feedback on your translation quality.

### What's expected of your video

- Length of video >= 10 min
- If possible, make it fun (because life is good).
- If possible, make it fancy (because you are young).
- And yes, your video should be presented in English.

### Submission of your work

1. Create a folder, in which you put:
   - the video
   - the source code
   - a txt/markdown file indicating:
     - the task of each team member
     - the estimated workload/contribution percentage of each team member
   - a txt/markdown file indicating:
     - the URL of the GitHub repository hosting your code
     - the prof will check the commit history of your GitHub repo to see how each team member contributed
1. Zip the folder
1. Upload the zip file to Google Drive
1. Send the sharing link to the prof, by PRIVATE WeChat or by email
   - Therefore, in the WeChat/email message, there are no attached files, just a Google Drive URL.

Only one submission per team is necessary, made by one member of your team.

Deadline for submission:

- The first Friday of SHU's 14-day exam period, at 23:59.
- For 2025, that's Friday 30/05, 23:59.

### Gallery of final project videos

Bilibili videos:

- https://space.bilibili.com/472463946/lists/1487100?type=season

By default, you agree that the prof may also share your videos on Bilibili.

## Score

Denoting your continuous assessment score as `T` and your project score as `P`, your final score will be

```python
0.7 * T + 0.3 * P
```

### Distribution of grades

According to SHU rules, the distribution of grades is as follows:

- 10% A (90-100)
- 20% A- (85-89)
- 30% B (80-84)
- 20% C (75-79)
- 20% D/E/F

Historical failure rates:

- 2021: 10%
- 2023: 10%

#### Zen of Python

https://peps.python.org/pep-0020/

```text
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
```

## Asking questions :question:

### Leveraging **[Gitee Issue](https://gitee.com/lundechen/cpp/issues)** for asking questions

By default, you should ask questions via **[Gitee Issue](https://gitee.com/lundechen/cpp/issues)**. Here is how:

- https://www.bilibili.com/video/BV1364y1h7sb/

### Principle

Here is the principle for asking questions:

> **Google/ChatGPT First, Peers Second, Profs Last.**

You are expected to ask questions via **[Gitee Issue](https://gitee.com/lundechen/cpp/issues)**. However, as a **secondary** (and hence less desirable, less encouraged) choice, you could also ask questions in the WeChat group.

> Why Gitee Issue? Because it's simply more **professional**, and better in every sense.

In Gitee Issue and the WeChat group, questions will be answered selectively. Questions won't be answered if:

- they could be solved with a simple Google search
- they are out of the scope of the course
- they are well in advance of the progress of the course
- the professors think they are not interesting for discussion

### Regarding personal WeChat chats

- **Questions asked in personal WeChat chats will NOT be answered.**

Learning how to use Google & Baidu & Bing & ChatGPT to solve computer science problems is an important skill you should develop during this course.

For private questions, please send your questions by email to:

- lundechen@shu.edu.cn (Lunde Chen)

### Office visit

Office visits are NOT welcome unless you make an appointment at least one day in advance.

## Student Name List

| Student ID | Name |
| ---------- | ---- |
| 22124692 | 赵俊凯 |
| 22124694 | 冯以凡 (on leave) |
| 22124705 | 段轩 |
| 22124725 | 乔正蓬 |
| 22124731 | 王嘉浩 |
| 22124732 | 雷宇杨 |
| 22124747 | 陈佚 |
| 22124774 | 杨宇周 |
| 22124776 | 韦杰 |
| 22124780 | 刘璐 |
| 25D61119 | Herbin Mathis |
| 25D61073 | Chiem Romain |
| 25D61022 | Mocquant Henri |
| 25D61098 | Desmaret Mathéo |
| 25D61093 | CORNET Nicolas |
| 25D61118 | Boniface Pierre |
| 25D61034 | Douik Ahmed |
| 25D61086 | Million Corentin |
| 25D61156 | Maxime Boschet |
| 25D61033 | HUWER Paul |
| 25D61069 | Ferrasse--Jamaux Tom |
| 25D61070 | Maillard Camil |
| 25D61089 | Edouard Marchand |
| 25D61071 | CHAUDRY COURPAT Kizzy |
| 25D61125 | Gomes Romain |
| 25D61017 | PEREIRA DE ARAUJO Théo |
| 25D61016 | Morin Loic |
| 25D61130 | Aristide BRUNAUD |
| 25D61136 | LE Rémy |
| 25D61123 | Julie BENED |
| 25D61088 | Lambert Gabin |
| 25D61013 | Fouich Noa |
| 25D61018 | MEDANE Yani |
| 25D61085 | Amaury DUFRENOT |
| 25D61030 | Jean Lagorsse |

## Online resources

1. Andrew Ng's Machine Learning series (吴恩达机器学习系列):
   - https://www.bilibili.com/video/BV164411b7dx
1. Andrew Ng's Deep Learning series (吴恩达深度学习系列):
   - https://www.bilibili.com/video/BV164411m79z

## F.A.Q

#### What characterizes this ML01 machine learning course?

- Stressful, fun and rewarding.

#### Do we have extra-course work?
- Yes. A lot.
- At least 5 hours of extra-course work each week is expected from you:
  - 2 hours for course content revision & test preparation
  - 3 hours for your project (2 is the bare minimum; you might want to shoot up to 20 or 30 towards the end of the trimester)

#### Why does this course seem a bit different?

- Well, the prof draws inspiration from courses at Stanford, Berkeley and MIT:
  - http://cs231n.stanford.edu (Stanford)
  - https://c.d2l.ai/berkeley-stat-157 (Berkeley)
  - http://introtodeeplearning.com (MIT)
  - http://cs229.stanford.edu (Stanford)

#### Do we have a slogan?

- Yes. See the picture below 👇

![](img/justdoit.png)

## License

This repository is licensed under [MIT](LICENSE).