Before reading this article, it helps to first understand the differences between standard convolution, transposed convolution, and dilated convolution; that background makes the code below easier to follow, as well as why I implemented my project the way I did.
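As a quick refresher on those three operations, their output-size arithmetic (for 'valid' padding, the setting used throughout this post) can be written down and checked directly. This is my own illustrative sketch, not code from the network itself:

```python
# Output-size arithmetic along one spatial axis, 'valid' padding throughout.

def conv_out(n, k, s=1):
    """Standard convolution: the kernel must fit inside the input."""
    return (n - k) // s + 1

def deconv_out(n, k, s=1):
    """Transposed (de)convolution: inverts the size change of conv_out."""
    return (n - 1) * s + k

def dilated_conv_out(n, k, d, s=1):
    """Dilated convolution: the kernel is spread to an effective size
    k + (k-1)*(d-1), enlarging the receptive field with no extra weights."""
    k_eff = k + (k - 1) * (d - 1)
    return (n - k_eff) // s + 1

# An 80-pixel axis through a 3x3 'valid' conv loses 2 pixels: 80 -> 78
print(conv_out(80, 3))             # 78
# A 2x2 transposed conv with stride 2 doubles the axis: 6 -> 12
print(deconv_out(6, 2, s=2))       # 12
# A 3x3 conv with dilation 2 behaves like a 5x5 conv: 80 -> 76
print(dilated_conv_out(80, 3, 2))  # 76
# A 3x3 transposed conv undoes a 3x3 conv: 80 -> 78 -> 80
print(deconv_out(conv_out(80, 3), 3))  # 80
```

The last line is exactly why the decoders below mirror the encoders with same-sized transposed convolutions.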
1 Principle
I previously implemented lane detection using traditional methods:
http://www.noobyard.com/article/p-tsohlvro-nt.html
Lane detection is a key component of the perception module in an autonomous-vehicle system, and vision-based lane detection is one of the more common solutions. A vision-based approach uses image algorithms to detect the lane-marking regions of the road in each image.
Lane detection on highways is a challenging task: lane markings come in many varieties, marking regions can be occluded by crowded traffic, the paint may be worn or eroded, and weather and other factors add further difficulty. In the past, most lane-detection algorithms used convolutional filtering to identify and segment candidate lane regions, then applied the Hough transform, RANSAC, or similar algorithms to detect the lane lines. These pipelines require hand-tuned filter kernels, with parameters adjusted for the particular street scenes the algorithm targets; this is labor-intensive and not very robust, and detection quality drops noticeably when the driving environment changes. See my previous post:
https://blog.csdn.net/xiao__run/article/details/82746319
Building on the traditional lane-detection pipeline and combining it with deep learning, this post uses a deep neural network in place of the hand-tuned filter kernels to semantically segment lane lines on highways. The basic structure is a SegNet-like network for segmenting the lane-line region; FCN, SegNet, and U-Net segmentation networks all share this encoder-decoder structure. I made some basic changes to the feature selection, and the results are passable. The main design choices:
1 The final layer uses a single 3x3 convolution kernel for regression.
2 Optimization uses Adam.
3 The loss is loss='mean_squared_error'.
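The loss in point 3 simply averages the squared per-pixel error over the whole 80 x 160 x 1 label. A minimal NumPy sketch of what that computes (my own illustration with made-up arrays, not the Keras internals):

```python
import numpy as np

# Hypothetical prediction and label, shaped like the network's output (80 x 160 x 1).
pred = np.zeros((80, 160, 1), dtype=np.float32)
label = np.zeros((80, 160, 1), dtype=np.float32)
label[40:45, 70:90, 0] = 1.0  # a small patch of "lane" pixels (labels normalized to [0, 1])

# mean_squared_error: average of (pred - label)^2 over every pixel
mse = np.mean((pred - label) ** 2)
print(mse)  # 100 wrong pixels out of 12800 -> 0.0078125
```

Because the labels are normalized to [0, 1], this per-pixel regression behaves much like a soft segmentation mask.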
2 Dataset samples
The dataset is shown in the figure below.
Segmentation maps:
The network model is as follows:
""" This file contains code for a fully convolutional (i.e. contains zero fully connected layers) neural network for detecting lanes. This version assumes the inputs to be road images in the shape of 80 x 160 x 3 (RGB) with the labels as 80 x 160 x 1 (just the G channel with a re-drawn lane). Note that in order to view a returned image, the predictions is later stacked with zero'ed R and B layers and added back to the initial road image. """python import numpy as np import pickle import cv2 from sklearn.utils import shuffle from sklearn.model_selection import train_test_split # Import necessary items from Keras from keras.models import Model from keras.layers import Activation, Dropout, UpSampling2D, concatenate, Input from keras.layers import Conv2DTranspose, Conv2D, MaxPooling2D from keras.layers.normalization import BatchNormalization from keras.preprocessing.image import ImageDataGenerator from keras.utils import plot_model from keras import regularizers # Load training images train_images = pickle.load(open("full_CNN_train.p", "rb" )) # Load image labels labels = pickle.load(open("full_CNN_labels.p", "rb" )) # Make into arrays as the neural network wants these train_images = np.array(train_images) labels = np.array(labels) # Normalize labels - training images get normalized to start in the network labels = labels / 255 # Shuffle images along with their labels, then split into training/validation sets train_images, labels = shuffle(train_images, labels) # Test size may be 10% or 20% X_train, X_val, y_train, y_val = train_test_split(train_images, labels, test_size=0.1) # Batch size, epochs and pool size below are all paramaters to fiddle with for optimization batch_size = 16 epochs = 10 pool_size = (2, 2) #input_shape = X_train.shape[1:] ### Here is the actual neural network ### # Normalizes incoming inputs. 
First layer needs the input shape to work #BatchNormalization(input_shape=input_shape) Inputs = Input(batch_shape=(None, 80, 160, 3)) # Below layers were re-named for easier reading of model summary; this not necessary # Conv Layer 1 Conv1 = Conv2D(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Inputs) Bat1 = BatchNormalization()(Conv1) # Conv Layer 2 Conv2 = Conv2D(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Conv1) Bat2 = BatchNormalization()(Conv2) # Pooling 1 Pool1 = MaxPooling2D(pool_size=pool_size)(Conv2) # Conv Layer 3 Conv3 = Conv2D(32, (3, 3), padding = 'valid', strides=(1,1), activation = 'relu')(Pool1) #Drop3 = Dropout(0.2)(Conv3) Bat3 = BatchNormalization()(Conv3) # Conv Layer 4 Conv4 = Conv2D(32, (3, 3), padding = 'valid', strides=(1,1), activation = 'relu')(Bat3) #Drop4 = Dropout(0.5)(Conv4) Bat4 = BatchNormalization()(Conv4) # Conv Layer 5 Conv5 = Conv2D(32, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Bat4) #Drop5 = Dropout(0.2)(Conv5) Bat5 = BatchNormalization()(Conv5) # Pooling 2 Pool2 = MaxPooling2D(pool_size=pool_size)(Bat5) # Conv Layer 6 Conv6 = Conv2D(64, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Pool2) #Drop6 = Dropout(0.2)(Conv6) Bat6 = BatchNormalization()(Conv6) # Conv Layer 7 Conv7 = Conv2D(64, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Bat6) #Drop7 = Dropout(0.2)(Conv7) Bat7 = BatchNormalization()(Conv7) # Pooling 3 Pool3 = MaxPooling2D(pool_size=pool_size)(Bat7) # Conv Layer 8 Conv8 = Conv2D(128, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Pool3) #Drop8 = Dropout(0.2)(Conv8) Bat8 = BatchNormalization()(Conv8) # Conv Layer 9 Conv9 = Conv2D(128, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Bat8) #Drop9 = Dropout(0.2)(Conv9) Bat9 = BatchNormalization()(Conv9) # Pooling 4 Pool4 = MaxPooling2D(pool_size=pool_size)(Bat9) # Upsample 1 Up1 = UpSampling2D(size=pool_size)(Pool4) Mer1 = concatenate([Up1, Bat9], axis=-1) # 
Deconv 1 Deconv1 = Conv2DTranspose(128, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Mer1) #DDrop1 = Dropout(0.2)(Deconv1) DBat1 = BatchNormalization()(Deconv1) # Deconv 2 Deconv2 = Conv2DTranspose(64, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(DBat1) #DDrop2 = Dropout(0.2)(Deconv2) DBat2 = BatchNormalization()(Deconv2) # Upsample 2 Up2 = UpSampling2D(size=pool_size)(DBat2) Mer2 = concatenate([Up2, Bat7], axis=-1) # Deconv 3 Deconv3 = Conv2DTranspose(64, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Mer2) #DDrop3 = Dropout(0.2)(Deconv3) DBat3 = BatchNormalization()(Deconv3) # Deconv 4 Deconv4 = Conv2DTranspose(32, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(DBat3) #DDrop4 = Dropout(0.2)(Deconv4) DBat4 = BatchNormalization()(Deconv4) # Upsample 3 Up3 = UpSampling2D(size=pool_size)(DBat4) Mer3 = concatenate([Up3, Bat5], axis=-1) # Deconv 5 Deconv5 = Conv2DTranspose(32, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Mer3) #DDrop5 = Dropout(0.2)(Deconv5) DBat5 = BatchNormalization()(Deconv5) # Deconv 6 Deconv6 = Conv2DTranspose(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(DBat5) #DDrop6 = Dropout(0.2)(Deconv6) DBat6 = BatchNormalization()(Deconv6) # Deconv 7 Deconv7 = Conv2DTranspose(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(DBat6) #DDrop7 = Dropout(0.2)(Deconv7) DBat7 = BatchNormalization()(Deconv7) # Upsample 4 Up4 = UpSampling2D(size=pool_size)(DBat7) Mer4 = concatenate([Up4, Bat2], axis=-1) # Deconv 8 Deconv8 = Conv2DTranspose(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Mer4) #DDrop8 = Dropout(0.2)(Deconv8) DBat8 = BatchNormalization()(Deconv8) # Deconv 9 Deconv9 = Conv2DTranspose(8, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(DBat8) #DDrop9 = Dropout(0.2)(Deconv9) DBat9 = BatchNormalization()(Deconv9) # Final layer - only including one channel so 1 filter Final = Conv2DTranspose(1, (3, 3), padding='same', 
strides=(1,1), activation = 'relu')(DBat9) ### End of network ### model = Model(inputs=Inputs, outputs=Final) # Using a generator to help the model use less data # Channel shifts help with shadows slightly datagen = ImageDataGenerator(channel_shift_range=0.2) datagen.fit(X_train) # Compiling and training the model model.compile(optimizer='Adam', loss='mean_squared_error') model.fit_generator(datagen.flow(X_train, y_train, batch_size=batch_size), steps_per_epoch=len(X_train)/batch_size, epochs=epochs, verbose=1, validation_data=(X_val, y_val)) # Freeze layers since training is done model.trainable = False model.compile(optimizer='Adam', loss='mean_squared_error') # Save model architecture and weights model.save('full_CNN_model.h5') # Show summary of model model.summary() plot_model(model, to_file='model.png')
Further improvement
Next, instead of using UpSampling2D directly, I use transposed convolutions for the upsampling, again with multi-scale feature fusion. The training code is as follows:
""" This file contains code for a fully convolutional (i.e. contains zero fully connected layers) neural network for detecting lanes. This version assumes the inputs to be road images in the shape of 80 x 160 x 3 (RGB) with the labels as 80 x 160 x 1 (just the G channel with a re-drawn lane). Note that in order to view a returned image, the predictions is later stacked with zero'ed R and B layers and added back to the initial road image. """ import numpy as np import pickle #import cv2 from sklearn.utils import shuffle from sklearn.model_selection import train_test_split # Import necessary items from Keras from keras.models import Model from keras.layers import Activation, Dropout, UpSampling2D, concatenate, Input from keras.layers import Conv2DTranspose, Conv2D, MaxPooling2D from keras.layers.normalization import BatchNormalization from keras.preprocessing.image import ImageDataGenerator from keras.utils import plot_model from keras import regularizers # Load training images train_images = pickle.load(open("full_CNN_train.p", "rb" )) # Load image labels labels = pickle.load(open("full_CNN_labels.p", "rb" )) # Make into arrays as the neural network wants these train_images = np.array(train_images) labels = np.array(labels) # Normalize labels - training images get normalized to start in the network labels = labels / 255 # Shuffle images along with their labels, then split into training/validation sets train_images, labels = shuffle(train_images, labels) # Test size may be 10% or 20% X_train, X_val, y_train, y_val = train_test_split(train_images, labels, test_size=0.1) # Batch size, epochs and pool size below are all paramaters to fiddle with for optimization batch_size = 16 epochs = 10 pool_size = (2, 2) #input_shape = X_train.shape[1:] ### Here is the actual neural network ### # Normalizes incoming inputs. 
First layer needs the input shape to work #BatchNormalization(input_shape=input_shape) Inputs = Input(batch_shape=(None, 80, 160, 3)) # Below layers were re-named for easier reading of model summary; this not necessary # Conv Layer 1 Conv1 = Conv2D(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Inputs) Bat1 = BatchNormalization()(Conv1) # Conv Layer 2 Conv2 = Conv2D(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Conv1) Bat2 = BatchNormalization()(Conv2) # Pooling 1 Pool1 = MaxPooling2D(pool_size=pool_size)(Conv2) # Conv Layer 3 Conv3 = Conv2D(32, (3, 3), padding = 'valid', strides=(1,1), activation = 'relu')(Pool1) #Drop3 = Dropout(0.2)(Conv3) Bat3 = BatchNormalization()(Conv3) # Conv Layer 4 Conv4 = Conv2D(32, (3, 3), padding = 'valid', strides=(1,1), activation = 'relu')(Bat3) #Drop4 = Dropout(0.5)(Conv4) Bat4 = BatchNormalization()(Conv4) # Conv Layer 5 Conv5 = Conv2D(32, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Bat4) #Drop5 = Dropout(0.2)(Conv5) Bat5 = BatchNormalization()(Conv5) # Pooling 2 Pool2 = MaxPooling2D(pool_size=pool_size)(Bat5) # Conv Layer 6 Conv6 = Conv2D(64, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Pool2) #Drop6 = Dropout(0.2)(Conv6) Bat6 = BatchNormalization()(Conv6) # Conv Layer 7 Conv7 = Conv2D(64, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Bat6) #Drop7 = Dropout(0.2)(Conv7) Bat7 = BatchNormalization()(Conv7) # Pooling 3 Pool3 = MaxPooling2D(pool_size=pool_size)(Bat7) # Conv Layer 8 Conv8 = Conv2D(128, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Pool3) #Drop8 = Dropout(0.2)(Conv8) Bat8 = BatchNormalization()(Conv8) # Conv Layer 9 Conv9 = Conv2D(128, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Bat8) #Drop9 = Dropout(0.2)(Conv9) Bat9 = BatchNormalization()(Conv9) # Pooling 4 Pool4 = MaxPooling2D(pool_size=pool_size)(Bat9) # Upsample 1 to Deconv 1 Deconv1 = Conv2DTranspose(128, (2, 2), padding='valid', strides=(2,2), 
activation = 'relu')(Pool4) #Up1 = UpSampling2D(size=pool_size)(Pool4) Mer1 = concatenate([Deconv1, Bat9], axis=-1) # Deconv 2 Deconv2 = Conv2DTranspose(128, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Mer1) DBat2 = BatchNormalization()(Deconv2) # Deconv 3 Deconv3 = Conv2DTranspose(64, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(DBat2) DBat3 = BatchNormalization()(Deconv3) # Upsample 2 to Deconv 4 Deconv4 = Conv2DTranspose(64, (2, 2), padding='valid', strides=(2,2), activation = 'relu')(DBat3) #Up2 = UpSampling2D(size=pool_size)(DBat2) Mer2 = concatenate([Deconv4, Bat7], axis=-1) # Deconv 5 Deconv5 = Conv2DTranspose(64, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Mer2) DBat5 = BatchNormalization()(Deconv5) # Deconv 6 Deconv6 = Conv2DTranspose(32, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(DBat5) DBat6 = BatchNormalization()(Deconv6) # Upsample 3 to Deconv 7 Deconv7 = Conv2DTranspose(32, (2, 2), padding='valid', strides=(2,2), activation = 'relu')(DBat6) #Up3 = UpSampling2D(size=pool_size)(DBat4) Mer3 = concatenate([Deconv7, Bat5], axis=-1) # Deconv 8 Deconv8 = Conv2DTranspose(32, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Mer3) DBat8 = BatchNormalization()(Deconv8) # Deconv 9 Deconv9 = Conv2DTranspose(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(DBat8) DBat9 = BatchNormalization()(Deconv9) # Deconv 10 Deconv10 = Conv2DTranspose(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(DBat9) DBat10 = BatchNormalization()(Deconv10) # Upsample 4 to Deconv 11 Deconv11 = Conv2DTranspose(16, (2, 2), padding='valid', strides=(2,2), activation = 'relu')(DBat10) #Up4 = UpSampling2D(size=pool_size)(DBat7) Mer4 = concatenate([Deconv11, Bat2], axis=-1) # Deconv 12 Deconv12 = Conv2DTranspose(16, (3, 3), padding='valid', strides=(1,1), activation = 'relu')(Mer4) DBat12 = BatchNormalization()(Deconv12) # Deconv 13 Deconv13 = Conv2DTranspose(8, (3, 3), 
padding='valid', strides=(1,1), activation = 'relu')(DBat12) DBat13 = BatchNormalization()(Deconv13) # Final layer - only including one channel so 1 filter Final = Conv2DTranspose(1, (3, 3), padding='same', strides=(1,1), activation = 'relu')(DBat13) ### End of network ### model = Model(inputs=Inputs, outputs=Final) # Using a generator to help the model use less data # Channel shifts help with shadows slightly datagen = ImageDataGenerator(channel_shift_range=0.2) datagen.fit(X_train) # Compiling and training the model model.compile(optimizer='Adam', loss='mean_squared_error') model.fit_generator(datagen.flow(X_train, y_train, batch_size=batch_size), steps_per_epoch=len(X_train)/batch_size, epochs=epochs, verbose=1, validation_data=(X_val, y_val)) # Freeze layers since training is done model.trainable = False model.compile(optimizer='Adam', loss='mean_squared_error') # Save model architecture and weights model.save('full_CNN_model.h5') # Show summary of model model.summary() plot_model(model, to_file='model.png')
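A useful sanity check on this decoder: each stride-2 Conv2DTranspose must restore exactly the spatial size of the encoder feature map it is concatenated with, or the concatenate calls will fail. Walking the 80 x 160 input through the encoder with the size rules for 'valid' 3x3 convolutions, 2x2 pooling, and stride-2 2x2 transposed convolutions (my own arithmetic sketch, not part of the training script):

```python
def conv3(h, w):         # 3x3 'valid' conv: each axis shrinks by 2
    return h - 2, w - 2

def pool(h, w):          # 2x2 max pooling: each axis halves
    return h // 2, w // 2

def deconv2x2_s2(h, w):  # 2x2 transposed conv, stride 2: each axis doubles
    return h * 2, w * 2

s = (80, 160)
s = conv3(*conv3(*s))          # Conv1, Conv2 -> Bat2: (76, 156)
bat2 = s
s = pool(*s)                   # Pool1: (38, 78)
s = conv3(*conv3(*conv3(*s)))  # Conv3-5 -> Bat5: (32, 72)
bat5 = s
s = pool(*s)                   # Pool2: (16, 36)
s = conv3(*conv3(*s))          # Conv6, Conv7 -> Bat7: (12, 32)
bat7 = s
s = pool(*s)                   # Pool3: (6, 16)
s = conv3(*conv3(*s))          # Conv8, Conv9 -> Bat9: (2, 12)
bat9 = s
s = pool(*s)                   # Pool4: (1, 6)

# Deconv1 doubles (1, 6) back to (2, 12) -- exactly Bat9's size,
# so concatenate([Deconv1, Bat9]) is valid; the other skips work the same way.
print(deconv2x2_s2(*s) == bat9)  # True
```

The 3x3 transposed convolutions between the skips then undo the 3x3 'valid' convolutions one by one, so the output grows back to 80 x 160 by the final layer.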
Testing
Having trained the model, let's test it on a video.
At test time, the prediction is averaged over the last 5 consecutive frames.
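The five-frame averaging keeps a short history of recent predictions and averages them to suppress frame-to-frame flicker. The same idea as a self-contained sketch, using `collections.deque` with `maxlen` instead of the list slicing in the script below (the `LaneSmoother` class and its inputs are my own illustration):

```python
from collections import deque
import numpy as np

class LaneSmoother:
    """Average the last n lane-mask predictions to reduce flicker."""
    def __init__(self, n=5):
        self.recent = deque(maxlen=n)  # frames older than n drop off automatically

    def update(self, prediction):
        self.recent.append(prediction)
        return np.mean(np.array(list(self.recent)), axis=0)

# Feed in five constant "masks"; the running average is the mean of the window
smoother = LaneSmoother(n=5)
for value in [0.0, 1.0, 1.0, 1.0, 1.0]:
    avg = smoother.update(np.full((80, 160), value))
print(avg.mean())  # 0.8
```

The bounded deque makes the windowing explicit, whereas the script below trims a plain list by hand; both compute the same average.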
```python
import numpy as np
import cv2
from scipy.misc import imresize
from moviepy.editor import VideoFileClip
from IPython.display import HTML
from keras.models import load_model
import matplotlib.pyplot as plt

# Load Keras model
model = load_model('full_CNN_model.h5')

# Class to average lanes with
class Lanes():
    def __init__(self):
        self.recent_fit = []
        self.avg_fit = []

def road_lines(image):
    """ Takes in a road image, re-sizes for the model,
    predicts the lane to be drawn from the model in G color,
    recreates an RGB image of a lane and merges with the
    original road image.
    """
    # Get image ready for feeding into model
    small_img = imresize(image, (80, 160, 3))
    small_img = np.array(small_img)
    small_img = small_img[None,:,:,:]

    # Make prediction with neural network (un-normalize value by multiplying by 255)
    prediction = model.predict(small_img)[0] * 255

    # Add lane prediction to list for averaging
    lanes.recent_fit.append(prediction)
    # Only using last five for average
    if len(lanes.recent_fit) > 5:
        lanes.recent_fit = lanes.recent_fit[1:]

    # Calculate average detection
    lanes.avg_fit = np.mean(np.array([i for i in lanes.recent_fit]), axis=0)

    # Generate fake R & B color dimensions, stack with G
    blanks = np.zeros_like(lanes.avg_fit).astype(np.uint8)
    lane_drawn = np.dstack((blanks, lanes.avg_fit, blanks))

    # Re-size to match the original image
    lane_image = imresize(lane_drawn, (720, 1280, 3))
    #plt.imshow(lane_image)
    #plt.show()

    # Merge the lane drawing onto the original image
    result = cv2.addWeighted(image, 1, lane_image, 1, 0)

    return result

lanes = Lanes()

# Where to save the output video
vid_output = 'harder_challenge.mp4'
clip1 = VideoFileClip("project_video.mp4")
vid_clip = clip1.fl_image(road_lines)
vid_clip.write_videofile(vid_output, audio=False)
```
Results
The results are shown in the figure below:
The results are just about acceptable. I will make further improvements later, for example speeding inference up with E-Net, or training a LaneNet-style instance-segmentation-plus-clustering pipeline on the TuSimple dataset to get better results.