Session 14 – Perceptron & Multi-Layer Perceptron
From the Perceptron as a simple linear separator to the MLP, which can model non-linear relationships through multiple layers and activation functions. We work through the basic flow of backpropagation and how to preserve generalization via regularization & early stopping.
Objectives: distinguish the Perceptron from the MLP, understand activations & losses, and train a small, stable MLP.
Material Review: Definitions, Intuition, and Examples
Perceptron
A binary linear model with a simple update rule. A good first stepping stone, but it fails on non-linear data such as XOR. Visualizing the decision boundary helps expose the model's limits.
MLP & Backprop
An MLP stacks linear layers with non-linear activations. Backprop trains the weights by descending the loss through the chain rule. In practice, use an optimizer such as Adam and early stopping for stability.
Summary of Intuition & Formulas:
1) Perceptron (Linear Binary Classifier)
• z = w·x + b; prediction: y = 1 if z ≥ 0, else 0.
• Update rule (per sample): w ← w + α (y_true − y_pred) x; b ← b + α (y_true − y_pred) (one-step sketch below).
• Can only separate linearly separable data.
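A minimal sketch of one update step, directly mirroring the rule above; the names w, b, x_i, y_true are illustrative, not taken from the lab code:
# One perceptron update step on a single sample (illustrative)
import numpy as np
w, b = np.zeros(2), 0.0                  # weights and bias
x_i, y_true = np.array([1.0, 2.0]), 1    # one training sample
alpha = 0.1                              # learning rate
y_pred = 1 if np.dot(w, x_i) + b >= 0 else 0
w += alpha * (y_true - y_pred) * x_i     # w ← w + α (y_true − y_pred) x
b += alpha * (y_true - y_pred)           # b ← b + α (y_true − y_pred)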
2) MLP (Multi-Layer Perceptron)
• Architecture: [input] → [hidden (activation)] → ... → [output].
• Common activations: ReLU, Sigmoid, Tanh.
• Binary classification: σ(z) = 1/(1+e^{−z}); loss: Binary Cross-Entropy.
• Multi-class classification: softmax; loss: Cross-Entropy.
• Regression: linear output; loss: MSE.
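Minimal NumPy sketches of these activations and losses (illustrative helpers, not a library API):
# Activations and losses from the summary above (illustrative)
import numpy as np
def sigmoid(z):                                    # binary output
    return 1/(1 + np.exp(-z))
def softmax(z):                                    # multi-class output, row-wise
    e = np.exp(z - z.max(axis=1, keepdims=True))   # shift for numerical stability
    return e / e.sum(axis=1, keepdims=True)
def bce(y, p, eps=1e-8):                           # Binary Cross-Entropy
    return -np.mean(y*np.log(p+eps) + (1-y)*np.log(1-p+eps))
print(softmax(np.array([[1.0, 2.0, 3.0]])))        # each row sums to 1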
3) Backpropagation (sketch)
• Compute the forward pass (z, a) per layer; loss L(y, ŷ).
• Compute ∂L/∂w, ∂L/∂b via the chain rule; update with GD/SGD/Adam.
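A standard way to verify a chain-rule derivation is a finite-difference gradient check. The sketch below uses an assumed toy setup (one sigmoid unit with BCE loss, where ∂L/∂w = (σ(w·x) − y)·x) and compares the analytic gradient against a numeric estimate:
# Gradient check: analytic gradient vs central differences (illustrative)
import numpy as np
rng = np.random.default_rng(0)
x, w, y = rng.normal(size=3), rng.normal(size=3), 1.0
sig = lambda z: 1/(1 + np.exp(-z))
loss = lambda wv: -(y*np.log(sig(wv @ x)) + (1-y)*np.log(1 - sig(wv @ x)))
analytic = (sig(w @ x) - y) * x                     # dL/dz · dz/dw
numeric, h = np.zeros_like(w), 1e-6
for j in range(len(w)):
    e = np.zeros_like(w); e[j] = h
    numeric[j] = (loss(w + e) - loss(w - e)) / (2*h)
print(np.allclose(analytic, numeric, atol=1e-5))    # expect True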
4) Regularization & Generalization
• Weight decay (L2, α), dropout, early stopping, batch normalization (sketch below).
• Overfitting: the train-vs-validation gap widens → apply regularization & more data.
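As a hedged sketch of how L2 weight decay enters the update (all names illustrative): the decay term α·w is simply added to the data-loss gradient, shrinking weights toward zero.
# One gradient step with L2 weight decay (illustrative)
import numpy as np
w = np.array([0.5, -1.2])          # current weights
grad = np.array([0.1, 0.3])        # gradient of the data loss w.r.t. w
alpha, lr = 1e-3, 0.1              # L2 strength and learning rate
w -= lr * (grad + alpha * w)       # decay pulls w toward 0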
Case Studies
- Linear vs Non-Linear: Perceptron (linear) vs MLP on the same data.
- XOR: the MLP solves a pattern the Perceptron fails on (contrast check in the lab below).
- Moons: MLP (ReLU) + scaling + early stopping.
Lab: Perceptron, NumPy MLP, MLPClassifier, Visualization, Regression
# ====== Perceptron from Scratch (Linear Classification) ======
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report
import matplotlib.pyplot as plt
np.random.seed(0)
# Two classes that are roughly linearly separable
N=200
X1 = np.random.multivariate_normal([0,1], [[0.8,0.2],[0.2,0.8]], N)
X2 = np.random.multivariate_normal([2,3], [[0.8,-0.1],[-0.1,0.8]], N)
X = np.vstack([X1,X2])
y = np.array([0]*N + [1]*N)
Xtr, Xte, ytr, yte = train_test_split(X,y,test_size=0.25,random_state=42)
# add a bias term (column of ones)
Xtr_ = np.c_[np.ones(len(Xtr)), Xtr]
Xte_ = np.c_[np.ones(len(Xte)), Xte]
w = np.zeros(Xtr_.shape[1])
alpha=0.1
epochs=30
hist=[]
for ep in range(epochs):
    # stochastic updates: visit samples in random order
    idx = np.random.permutation(len(Xtr_))
    for i in idx:
        z = np.dot(w, Xtr_[i])
        yhat = 1 if z >= 0 else 0
        w += alpha * (ytr[i] - yhat) * Xtr_[i]
    # simple per-epoch evaluation
    pred = (Xte_.dot(w) >= 0).astype(int)
    acc = accuracy_score(yte, pred)
    hist.append(acc)
    if ep % 5 == 0:
        print(f"epoch {ep}: acc={acc:.3f}")
print('Final accuracy:', hist[-1])
# Visualize the decision boundary
xx, yy = np.meshgrid(np.linspace(X[:,0].min()-1, X[:,0].max()+1, 220),
                     np.linspace(X[:,1].min()-1, X[:,1].max()+1, 220))
ZZ = (w[0] + w[1]*xx + w[2]*yy) >= 0
plt.figure(figsize=(5,4))
plt.contourf(xx,yy,ZZ,alpha=0.2)
plt.scatter(Xte[:,0], Xte[:,1], c=yte, s=16, edgecolor='k')
plt.title('Perceptron Decision Boundary')
plt.show()
# ====== MLP with 1 Hidden Layer (NumPy) ======
# Note: educational implementation (not production-grade)
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
np.random.seed(1)
# XOR dataset (not linearly separable)
X = np.array([
    [0,0], [0,1], [1,0], [1,1]
], dtype=float)
y = np.array([0,1,1,0])  # binary labels
# replicate the samples with small noise
X = np.repeat(X, 50, axis=0) + np.random.randn(200,2)*0.05
y = np.repeat(y, 50)
Xtr, Xte, ytr, yte = train_test_split(X,y,test_size=0.25,random_state=0)
# Architecture: 2 → 8 → 1 (sigmoid output)
D, H, O = 2, 8, 1
W1 = 0.1*np.random.randn(D,H); b1 = np.zeros(H)
W2 = 0.1*np.random.randn(H,O); b2 = np.zeros(O)
lr=0.1
epochs=3000
sig = lambda z: 1/(1+np.exp(-z))
def forward(X):
    z1 = X.dot(W1) + b1
    a1 = np.tanh(z1)      # tanh hidden activation
    z2 = a1.dot(W2) + b2
    a2 = sig(z2)          # output probability
    return z1, a1, z2, a2
yt = ytr.reshape(-1, 1)  # column vector so the BCE terms broadcast elementwise
for ep in range(epochs):
    z1, a1, z2, a2 = forward(Xtr)
    # loss = BCE (binary cross-entropy)
    eps = 1e-8
    loss = -np.mean(yt*np.log(a2+eps) + (1-yt)*np.log(1-a2+eps))
    # backprop
    dz2 = a2 - yt                       # sigmoid + BCE simplifies to (a2 − y)
    dW2 = a1.T.dot(dz2)/len(Xtr)
    db2 = dz2.mean(axis=0)
    da1 = dz2.dot(W2.T)
    dz1 = (1 - a1**2) * da1             # tanh'(z1) = 1 − tanh(z1)^2 = 1 − a1^2
    dW1 = Xtr.T.dot(dz1)/len(Xtr)
    db1 = dz1.mean(axis=0)
    # update (plain gradient descent)
    W2 -= lr*dW2; b2 -= lr*db2
    W1 -= lr*dW1; b1 -= lr*db1
    if ep % 500 == 0:
        pred = (forward(Xte)[3] >= 0.5).astype(int).ravel()
        acc = accuracy_score(yte, pred)
        print(f"ep={ep} loss={loss:.3f} acc_te={acc:.3f}")
pred = (forward(Xte)[3] >= 0.5).astype(int).ravel()
print('Final accuracy:', accuracy_score(yte, pred))
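For contrast with the XOR case study, a quick hedged check (reusing the Xtr/Xte split from this lab): a linear Perceptron should sit near chance on the same data.
# ====== (Contrast) Linear Perceptron on the same XOR data ======
from sklearn.linear_model import Perceptron
perc = Perceptron(random_state=0).fit(Xtr, ytr)
print('Perceptron accuracy on XOR:', perc.score(Xte, yte))  # expect ~0.5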
# ====== MLPClassifier (sklearn) — Classification ======
# If needed: !pip install scikit-learn
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, classification_report
X,y = make_moons(n_samples=800, noise=0.25, random_state=0)
Xtr,Xte,ytr,yte = train_test_split(X,y,test_size=0.25,random_state=42)
sc = StandardScaler(); Xtr_=sc.fit_transform(Xtr); Xte_=sc.transform(Xte)
clf = MLPClassifier(hidden_layer_sizes=(32,16), activation='relu', solver='adam',
                    alpha=1e-3, learning_rate_init=1e-3, max_iter=400, random_state=0,
                    early_stopping=True, n_iter_no_change=10)
clf.fit(Xtr_, ytr)
yp = clf.predict(Xte_)
print('Accuracy:', accuracy_score(yte, yp))
print(classification_report(yte, yp))
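To inspect training behavior (handy for the assignment's learning curves), a hedged sketch reusing the fitted clf above: MLPClassifier records loss_curve_, and validation_scores_ when early_stopping=True.
# ====== Learning Curves from the Fitted MLPClassifier ======
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1, 2, figsize=(9, 3.5))
ax[0].plot(clf.loss_curve_); ax[0].set_title('Training loss')
ax[1].plot(clf.validation_scores_); ax[1].set_title('Validation accuracy')  # exists because early_stopping=True
plt.show()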
# ====== Decision Boundary Visualization for the MLP ======
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
X,y = make_moons(n_samples=400, noise=0.25, random_state=1)
Xtr,Xte,ytr,yte = train_test_split(X,y,test_size=0.25,random_state=42)
sc = StandardScaler(); Xtr_=sc.fit_transform(Xtr); Xte_=sc.transform(Xte)
clf = MLPClassifier(hidden_layer_sizes=(32,16), activation='relu', solver='adam',
                    alpha=1e-3, max_iter=500, random_state=0)
clf.fit(Xtr_, ytr)
x_min,x_max = X[:,0].min()-0.5, X[:,0].max()+0.5
y_min,y_max = X[:,1].min()-0.5, X[:,1].max()+0.5
xx,yy = np.meshgrid(np.linspace(x_min,x_max,250), np.linspace(y_min,y_max,250))
Z = clf.predict(sc.transform(np.c_[xx.ravel(), yy.ravel()])).reshape(xx.shape)
plt.figure(figsize=(6,4))
plt.contourf(xx,yy,Z,alpha=0.25)
plt.scatter(Xte[:,0], Xte[:,1], c=yte, s=16, edgecolor='k')
plt.title('MLP Decision Boundary (ReLU)')
plt.show()
# ====== (Optional) MLPRegressor — Non-Linear Regression ======
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error
np.random.seed(2)
X = np.linspace(-3,3,300).reshape(-1,1)
y = np.sin(X[:,0]) + 0.3*np.cos(3*X[:,0]) + np.random.randn(300)*0.1
Xtr,Xte,ytr,yte = train_test_split(X,y,test_size=0.25,random_state=0)
reg = MLPRegressor(hidden_layer_sizes=(64,32), activation='tanh', solver='adam',
                   alpha=1e-3, learning_rate_init=1e-3, max_iter=800, random_state=0)
reg.fit(Xtr, ytr)
xx = np.linspace(-3,3,400).reshape(-1,1)
plt.figure(figsize=(6,4))
plt.scatter(Xte, yte, s=12, label='Test')
plt.plot(xx, reg.predict(xx), label='MLP fit')
plt.title('MLPRegressor on a Non-Linear Signal')
plt.legend(); plt.show()
print('Test MSE:', mean_squared_error(yte, reg.predict(Xte)))
# ====== Regularization Experiment (alpha, early stopping) ======
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
X,y = make_moons(n_samples=1000, noise=0.3, random_state=0)
Xtr,Xte,ytr,yte = train_test_split(X,y,test_size=0.25,random_state=42)
sc = StandardScaler(); Xtr_=sc.fit_transform(Xtr); Xte_=sc.transform(Xte)
for alpha in [1e-5, 1e-4, 1e-3, 1e-2]:
    clf = MLPClassifier(hidden_layer_sizes=(64,64), activation='relu', solver='adam',
                        alpha=alpha, max_iter=600, random_state=0, early_stopping=True,
                        n_iter_no_change=15)
    clf.fit(Xtr_, ytr)
    acc = accuracy_score(yte, clf.predict(Xte_))
    print(f'alpha={alpha}: acc={acc:.3f}, epochs={clf.n_iter_}')
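To observe the overfitting gap described in the summary, a small hedged addition (reusing the last clf fitted in the loop above): compare train vs test accuracy; a large gap suggests stronger regularization or more data.
# ====== Train-vs-Test Gap (overfitting check) ======
acc_tr = accuracy_score(ytr, clf.predict(Xtr_))
acc_te = accuracy_score(yte, clf.predict(Xte_))
print(f'train={acc_tr:.3f} test={acc_te:.3f} gap={acc_tr - acc_te:.3f}')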
# ====== Quiz 14 (self-check) ======
qs = [
    ("When does the Perceptron fail?", {"a": "When the data is not linearly separable", "b": "When the data is normal", "c": "When there are > 2 features", "d": "Never"}, "a"),
    ("A common activation for hidden layers:", {"a": "Linear", "b": "ReLU/Tanh", "c": "Sigmoid at the output only", "d": "Softmax for regression"}, "b"),
    ("The usual loss for binary classification with an MLP:", {"a": "MSE", "b": "MAE", "c": "Binary Cross-Entropy", "d": "Huber"}, "c"),
    ("How to prevent overfitting in an MLP:", {"a": "Add hidden units until it overfits", "b": "Reduce the data", "c": "L2 regularization/early stopping/dropout", "d": "Drop the validation set"}, "c"),
]
print('Answer key:')
for i, (_, __, ans) in enumerate(qs, 1):
    print(f'Q{i}: {ans}')
Assignment & References
Coding Assignment 10: Pick one real classification dataset (≥ 800 samples). Compare the Perceptron vs an MLP (≥ 1 hidden layer). Apply scaling, tune hyperparameters (hidden-layer size, α/L2, learning rate), and use early stopping. Report metrics (accuracy, F1) and include a decision-boundary plot (if 2D) or learning curves. Summarize in ≤ 1 page.
- Goodfellow, Bengio, Courville — Deep Learning (MLP & regularization chapters).
- Géron — Hands-On Machine Learning (neural networks chapters).