Cluster 12 -- Introduction to Machine Learning



Machine learning (ML), a subset of artificial intelligence, is considered a disruptive technology with applications across many domains, including healthcare, agriculture and the environment, robotics, and automation, as well as in everyday life. In recent years, machine learning has helped enterprises, researchers, and policymakers make data-driven decisions based on historical observations.

This cluster introduces the fundamentals of machine learning through interactive lectures, class discussions, and hands-on coding projects. Examples of learning modules include exploratory data analysis, supervised and unsupervised learning, model performance evaluation, and deep learning. The hands-on lab activities and projects allow students to learn and practice developing predictive models in Python, from pre-processing, training, and testing models to comparing models' performance using appropriate metrics.

The lab activities are designed so that students gain hands-on experience applying what they have learned in class to real data. For example, in one lab activity students may train a deep learning model that recognizes the objects in an image captured by a student’s smartphone. In discussion sessions, we explore applications of ML in various domains and discuss its social benefits and potential concerns.

This course aims to give students the basic knowledge and skills that form a foundation for exploring ML techniques in greater depth, through self-study and/or by choosing a related major in college.

Sample Code for potential students: 

If you can understand the code below, you probably meet the programming requirement for this course :)
In this code, given a set A of N vectors, the goal is to find the vector in A that is closest to a query vector v in terms of Euclidean distance.

import numpy as np

def dist(v, u):
    '''Computes the Euclidean distance between two vectors.'''
    # Make sure the vectors have equal length and that the length is greater than 0
    assert len(v) == len(u) and len(v) > 0
    sum_of_squares = 0
    for i in range(len(v)):
        sum_of_squares += (v[i] - u[i]) ** 2
    distance = np.sqrt(sum_of_squares)  # Euclidean distance
    return distance

def nearest_neighbor(A, v):
    '''Finds the vector in A with the minimum Euclidean distance to vector v.'''
    D = len(v)  # dimension of the query vector
    N = len(A)  # number of vectors in A
    assert N > 0  # we need at least one candidate vector
    for u in A:  # check that every vector in our set is D-dimensional
        assert len(u) == D
    ind = 0
    min_distance = dist(A[ind], v)  # start with the first vector as the best candidate
    for i, u in enumerate(A):
        d = dist(u, v)
        if d < min_distance:  # if A[i] is closer to v than A[ind], set ind <- i
            ind = i
            min_distance = d  # distance to the nearest neighbor found so far
    print('ind: {}, A[ind]: {}, distance: {}'.format(ind, A[ind], min_distance))
    return ind, A[ind], min_distance

# example in 2D

A = [[1, 1], [1, 2], [3, 1]] # this is our list with three vectors 
v = [3, 2] # this is our query vector 

# now, we want to find which vector in A is closest to our query vector v in terms of Euclidean distance 

ind, nearest_vector, min_distance = nearest_neighbor(A, v)
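
For comparison, the same search can be written without explicit loops by using NumPy array operations. The version below is only a sketch and is not part of the programming requirement: the function name nearest_neighbor_vectorized is made up for this illustration, and it assumes that A can be converted to a two-dimensional array with one vector per row.

```python
import numpy as np

def nearest_neighbor_vectorized(A, v):
    """Return (index, vector, distance) of the row of A closest to v.

    Hypothetical helper for illustration; assumes A converts to an (N, D) array.
    """
    A = np.asarray(A, dtype=float)  # shape (N, D)
    v = np.asarray(v, dtype=float)  # shape (D,)
    assert A.ndim == 2 and A.shape[0] > 0 and A.shape[1] == v.shape[0]
    # Broadcasting subtracts v from every row of A;
    # taking the norm along axis=1 gives one Euclidean distance per row.
    distances = np.linalg.norm(A - v, axis=1)
    ind = int(np.argmin(distances))  # index of the first smallest distance
    return ind, A[ind], float(distances[ind])

# same 2D example as above
A = [[1, 1], [1, 2], [3, 1]]
v = [3, 2]

ind, nearest_vector, min_distance = nearest_neighbor_vectorized(A, v)
print(ind, nearest_vector, min_distance)
```

On this example both versions pick the vector [3, 1] at index 2, which is at distance 1 from v. When two vectors are equally close, np.argmin returns the first one, which matches the strict "<" comparison in the loop version.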