Clothing Detection and Outfit Matching with DeepFashion

Editor: 005     Date: 2020-10-28

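In recent years, matching and recommending clothing and other products has attracted wide attention, and vision-based recommendation has made some progress. However, existing work usually represents products in a generic visual feature space, such as the output-layer features of a CNN (Convolutional Neural Network). Such features are sensitive to a product's category but struggle to model differences in style.

This makes them hard to use effectively in a recommender system: products of a similar style are often bought together by the same person, yet they are not close to each other in the visual feature space, which makes it difficult to improve recommendation quality. The FashionNet-based clothing landmark detection proposed in the paper DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations (CVPR 2016) addresses this problem.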


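Preparation

We use Python 3.6.5 with the following modules:

  • opencv is used for image processing and for reading and saving images.

  • numpy handles matrix computations.

  • Tensorflow-gpu is a widely used deep learning framework for building and training models; it uses the GPU to speed up computation.

  • scikit-learn is a commonly used machine learning library for Python.

  • PIL supports batch processing of images, generating previews, converting between image formats, and image manipulation, including basic image operations, pixel operations, and color operations.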
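Before running the code in this article, it can help to confirm that the modules listed above are importable. The following is a minimal sketch using only the standard library. The import names in REQUIRED_MODULES are our assumption about how these packages are usually imported (cv2 for opencv, sklearn for scikit-learn, PIL for Pillow); the article itself does not list them.

```python
import importlib.util
import sys

# Import names assumed for the modules listed above. The names used when
# installing them are usually different (for example opencv-python, Pillow).
REQUIRED_MODULES = ["cv2", "numpy", "tensorflow", "sklearn", "PIL"]


def find_missing(modules):
    """Return the names in `modules` that cannot be located by this interpreter."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


print("Python", sys.version.split()[0])
missing = find_missing(REQUIRED_MODULES)
print("missing modules:", missing if missing else "none")
```

find_spec only locates a top-level module and does not import it, so the check stays fast even when TensorFlow is installed.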

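Defining and Training the Network Model

The forward pass of FashionNet has three stages. In the first stage, a clothing image is fed into the blue branch of the network (the landmark branch), which predicts whether each clothing landmark is visible and where it is. In the second stage, the landmark pooling layer uses the landmark positions predicted in the previous step to extract local features of the garment. In the third stage, the global features from the "fc6 global" layer and the local features from "fc6 local" are concatenated into "fc7_fusion", which serves as the final image feature. FashionNet uses four loss functions and is optimized with an iterative training scheme: a regression loss for landmark localization, a softmax loss for landmark visibility and clothing category, a cross-entropy loss for attribute prediction, and a triplet loss for learning similarity between garments. The authors compare FashionNet with other methods on clothing classification, attribute prediction, and clothes retrieval, and report clearly better results on all three.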
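To make the four losses more concrete, here is a minimal NumPy sketch of each one for a single sample. It is a simplified illustration and not the paper's exact formulation: the function names, the margin value, and the unweighted attribute loss are our assumptions.

```python
import numpy as np


def landmark_regression_loss(pred_xy, true_xy, visible):
    """Squared L2 distance, counted only for landmarks marked visible."""
    sq_dist = np.sum((pred_xy - true_xy) ** 2, axis=-1)  # one value per landmark
    return float(np.sum(visible * sq_dist))


def softmax_cross_entropy(logits, label):
    """Softmax loss for a single-label target (category or visibility state)."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return float(-log_probs[label])


def attribute_cross_entropy(probs, targets, eps=1e-7):
    """Binary cross-entropy summed over independent attributes (multi-label)."""
    probs = np.clip(probs, eps, 1.0 - eps)
    return float(-np.sum(targets * np.log(probs) + (1 - targets) * np.log(1 - probs)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on the gap between anchor-positive and anchor-negative distances."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return float(max(0.0, margin + d_pos - d_neg))


# A negative that is far from the anchor gives zero loss; a close one is penalized.
anchor = np.array([0.0, 0.0])
positive = np.array([1.0, 0.0])
far_loss = triplet_loss(anchor, positive, np.array([3.0, 0.0]))
near_loss = triplet_loss(anchor, positive, np.array([0.0, 1.0]))
print(far_loss, near_loss)
```

The triplet loss only contributes when the negative sample is not at least a margin farther from the anchor than the positive sample, which is what pushes similar garments together in the feature space.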

(1) Network definition: the optimizer, the classifier, the network layers, and so on. The code is as follows:

# Imports assumed (the source omits them); with a TensorFlow-only install, use tensorflow.keras instead.
import logging
from keras import applications
from keras.layers import Dense, Dropout, Flatten, Input
from keras.models import Model

# get_optimizer, top_model_weights_path_load, predictions_class_weight and
# predictions_iou_weight are defined elsewhere in the project.
def create_model(is_input_bottleneck, is_load_weights, input_shape, output_classes,
                 optimizer='Adagrad', learn_rate=None, decay=0.0, momentum=0.0,
                 activation='relu', dropout_rate=0.5):
    logging.debug('input_shape {}'.format(input_shape))
    logging.debug('input_shape {}'.format(type(input_shape)))
    # Optimizer
    optimizer, learn_rate = get_optimizer(optimizer, learn_rate, decay, momentum)
    if is_input_bottleneck is True:
        # Train: the inputs are precomputed bottleneck features
        model_inputs = Input(shape=(input_shape))
        common_inputs = model_inputs
    else:
        # Predict: run the VGG16 convolutional base; input_shape = (img_width, img_height, 3)
        base_model = applications.VGG16(weights='imagenet', include_top=False, input_shape=input_shape)
        # base_model = applications.inception_v3.InceptionV3(include_top=False, weights='imagenet', input_shape=input_shape)
        logging.debug('base_model inputs {}'.format(base_model.input))    # shape=(?, 224, 224, 3)
        logging.debug('base_model outputs {}'.format(base_model.output))  # shape=(?, 7, 7, 512)
        model_inputs = base_model.input
        common_inputs = base_model.output
    ## Model Classification
    x = Flatten()(common_inputs)
    x = Dense(256, activation='tanh')(x)
    x = Dropout(dropout_rate)(x)
    predictions_class = Dense(output_classes, activation='softmax', name='predictions_class')(x)
    ## Model (Regression) IOU score
    x = Flatten()(common_inputs)
    x = Dense(256, activation='tanh')(x)
    x = Dropout(dropout_rate)(x)
    x = Dense(256, activation='tanh')(x)
    x = Dropout(dropout_rate)(x)
    predictions_iou = Dense(1, activation='sigmoid', name='predictions_iou')(x)
    ## Create Model
    model = Model(inputs=model_inputs, outputs=[predictions_class, predictions_iou])
    # logging.debug('model summary {}'.format(model.summary()))
    ## Load weights
    if is_load_weights is True:
        model.load_weights(top_model_weights_path_load, by_name=True)
    ## Compile
    model.compile(optimizer=optimizer,
                  loss={'predictions_class': 'sparse_categorical_crossentropy', 'predictions_iou': 'mean_squared_error'},
                  metrics=['accuracy'],
                  loss_weights={'predictions_class': predictions_class_weight, 'predictions_iou': predictions_iou_weight})
    logging.info('optimizer:{}  learn_rate:{}  decay:{}  momentum:{}  activation:{}  dropout_rate:{}'.format(
        optimizer, learn_rate, decay, momentum, activation, dropout_rate))
    return model
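The model above has two outputs, and compile() combines their losses as a weighted sum controlled by loss_weights. The following NumPy sketch reproduces that arithmetic outside Keras so the effect of the weights is easy to see. The helper names and the example weights (1.0 and 2.0) are ours and are not taken from the project configuration.

```python
import numpy as np


def sparse_categorical_crossentropy(probs, labels, eps=1e-7):
    """Mean of -log p[true class]; `labels` holds integer class indices."""
    picked = np.clip(probs[np.arange(len(labels)), labels], eps, 1.0)
    return float(np.mean(-np.log(picked)))


def mean_squared_error(pred, target):
    """Mean squared difference between predicted and target IoU scores."""
    return float(np.mean((pred - target) ** 2))


def combined_loss(class_probs, class_labels, iou_pred, iou_true, class_weight, iou_weight):
    """Weighted sum of the two head losses, mirroring loss_weights in compile()."""
    loss_class = sparse_categorical_crossentropy(class_probs, class_labels)
    loss_iou = mean_squared_error(iou_pred, iou_true)
    return class_weight * loss_class + iou_weight * loss_iou


# Two samples, two classes; the weights below are illustrative only.
class_probs = np.array([[0.5, 0.5], [0.25, 0.75]])
class_labels = np.array([0, 1])
iou_pred = np.array([0.5, 1.0])
iou_true = np.array([1.0, 1.0])
total = combined_loss(class_probs, class_labels, iou_pred, iou_true, class_weight=1.0, iou_weight=2.0)
print(total)
```

Raising iou_weight makes the optimizer care more about the regression head at the expense of classification, which is the trade-off the two weight settings in the project control.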

(2) Model initialization:

def init():
    global batch_size
    batch_size = batch_size_train
    logging.debug('batch_size {}'.format(batch_size))

    global class_names
    class_names = sorted(get_subdir_list(dataset_train_path))
    logging.debug('class_names {}'.format(class_names))

    global input_shape
    input_shape = (img_width, img_height, img_channel)
    logging.debug('input_shape {}'.format(input_shape))

    # Create the output, log, and bottleneck directories if they do not exist yet
    if not os.path.exists(output_path_name):
        os.makedirs(output_path_name)
    if not os.path.exists(logs_path_name):
        os.makedirs(logs_path_name)
    if not os.path.exists(btl_path):
        os.makedirs(btl_path)
    if not os.path.exists(btl_train_path):
        os.makedirs(btl_train_path)
    if not os.path.exists(btl_val_path):
        os.makedirs(btl_val_path)
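The repeated exists/makedirs pairs in init() could also be written more compactly. The sketch below is one possible alternative using os.makedirs with exist_ok=True; the helper name ensure_dirs and the directory names in the demonstration are placeholders, not the project's actual paths.

```python
import os
import tempfile


def ensure_dirs(paths):
    """Create every directory in `paths`; directories that already exist are left alone."""
    for path in paths:
        os.makedirs(path, exist_ok=True)


# Demonstration inside a throwaway directory; the sub-directory names are placeholders.
root = tempfile.mkdtemp()
demo_paths = [
    os.path.join(root, "output"),
    os.path.join(root, "logs"),
    os.path.join(root, "bottleneck", "train"),
    os.path.join(root, "bottleneck", "validation"),
]
ensure_dirs(demo_paths)
ensure_dirs(demo_paths)  # calling it a second time is harmless
```

Because makedirs creates intermediate directories, the parent bottleneck directory does not need its own entry in the list.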

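(3) Saving the bottleneck files. In general, a bottleneck structure is meant to reduce the number of parameters: it first reduces the dimensionality with a pointwise (PW) convolution, then applies a regular convolution, and finally restores the dimensionality with another pointwise convolution, giving an hourglass shape. In the code below, however, the "bottleneck files" are simply the outputs of the VGG16 convolutional base, computed batch by batch and cached to disk as .npy files so that they can be reused when training the top layers.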

def save_bottleneck():
    logging.debug('class_names {}'.format(class_names))
    logging.debug('batch_size {}'.format(batch_size))
    logging.debug('epochs {}'.format(epochs))
    logging.debug('input_shape {}'.format(input_shape))

    ## Build the VGG16 network
    model = applications.VGG16(include_top=False, weights='imagenet', input_shape=input_shape)
    # model = applications.inception_v3.InceptionV3(include_top=False, weights='imagenet', input_shape=input_shape)

    for train_val in ['train', 'validation']:
        with open('bottleneck/btl_' + train_val + '.txt', 'w') as f_image:
            for class_name in class_names:
                dataset_train_class_path = os.path.join(dataset_path, train_val, class_name)
                logging.debug('dataset_train_class_path {}'.format(dataset_train_class_path))
                images_list = []
                images_name_list = []
                images_path_name = sorted(glob.glob(dataset_train_class_path + '/*.jpg'))
                logging.debug('images_path_name {}'.format(len(images_path_name)))
                for index, image in enumerate(images_path_name):
                    # logging.debug('image {}'.format(image))
                    img = Image.open(image)
                    img = preprocess_image(img)
                    current_batch_size = len(images_list)
                    # logging.debug('current_batch_size {}'.format(current_batch_size))
                    images_list.append(img)
                    image_name = image.split('/')[-1].split('.jpg')[0]
                    images_name_list.append(image)
                    # TODO: Skipping last images of a class which do not sum up to batch_size
                    if current_batch_size < batch_size - 1:
                        continue
                    # A full batch has been collected: run it through the convolutional base
                    X = np.array(images_list)
                    bottleneck_features_train_class = model.predict(X, batch_size)
                    # bottleneck_features_train_class = model.predict(X, nb_train_class_samples // batch_size)
                    ## Save bottleneck file
                    btl_save_file_name = btl_path + train_val + '/btl_' + train_val + '_' + class_name + '.' + str(index).zfill(7) + '.npy'
                    logging.info('btl_save_file_name {}'.format(btl_save_file_name))
                    # Pass the file name to np.save; a handle opened in text mode ('w') fails on Python 3
                    np.save(btl_save_file_name, bottleneck_features_train_class)
                    for name in images_name_list:
                        f_image.write(str(name) + '\n')
                    images_list = []
                    images_name_list = []
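The TODO in save_bottleneck() notes that the last images of a class are skipped when they do not fill a whole batch. One way this could be avoided is to slice the image list into batches and let the final batch be shorter. The sketch below shows the idea with dummy arrays standing in for preprocessed images; the helper iter_batches is our own and is not part of the project.

```python
import numpy as np


def iter_batches(items, batch_size):
    """Yield consecutive slices of `items`; the final slice may be shorter."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Stand-in for preprocessed images: 10 arrays of shape (4, 4, 3).
images = [np.zeros((4, 4, 3), dtype=np.float32) for _ in range(10)]
batch_shapes = [np.array(batch).shape for batch in iter_batches(images, batch_size=4)]
print(batch_shapes)
```

With 10 images and a batch size of 4, the last batch holds the remaining 2 images instead of being dropped.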
(4) Training the model: load the network defined above and use the bottleneck files to build the validation set

def train_model():
    ## Build network
    model = applications.VGG16(include_top=False, weights='imagenet', input_shape=input_shape)
    # model = applications.inception_v3.InceptionV3(include_top=False, weights='imagenet', input_shape=input_shape)

    # Get sorted bottleneck filenames in a list
    btl_train_names = sorted(glob.glob(btl_train_path + '/*.npy'))
    btl_val_names = sorted(glob.glob(btl_val_path + '/*.npy'))

    ## Train Labels
    btl_train_list = []
    train_labels_class = []
    train_labels_iou = []

    # Load bottleneck files to create validation set
    val_data = []

    # NOTE: the source omits the code that loads the bottleneck files and turns
    # train_data, val_data and the label lists into NumPy arrays at this point.

    model = create_model(True, False, input_shape_btl_layer, len(class_names), optimizer, learn_rate, decay, momentum, activation, dropout_rate)

    logging.info('train_labels_iou {}'.format(train_labels_iou.shape))
    logging.info('train_labels_class {}'.format(train_labels_class.shape))
    logging.info('train_data {}'.format(train_data.shape))
    logging.info('val_labels_iou {}'.format(val_labels_iou.shape))
    logging.info('val_labels_class {}'.format(val_labels_class.shape))
    logging.info('val_data {}'.format(val_data.shape))

    # TODO: class_weight_val wrong
    model.fit(train_data, [train_labels_class, train_labels_iou],
              # class_weight: dictionary mapping classes to a weight value, used for
              # scaling the loss function (during training only)
              class_weight=[class_weight_val, class_weight_val],
              epochs=epochs,
              batch_size=batch_size,
              validation_data=(val_data, [val_labels_class, val_labels_iou]),
              callbacks=callbacks_list)

    # TODO: These are not the best weights
    model.save_weights(top_model_weights_path_save)
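The training code passes a class_weight value and flags it as wrong in a TODO. As an illustration of what a dictionary mapping classes to weights can look like, here is a sketch of the common "balanced" heuristic, where each class is weighted by n_samples / (n_classes * count). This is our assumption about a reasonable choice, not the project's method, and whether such a dictionary can be passed for a two-output model depends on the Keras version, which is not verified here.

```python
import numpy as np


def balanced_class_weights(labels, num_classes):
    """weight[c] = n_samples / (num_classes * count[c]); classes with no samples are skipped."""
    counts = np.bincount(labels, minlength=num_classes)
    n_samples = len(labels)
    return {
        c: float(n_samples / (num_classes * counts[c]))
        for c in range(num_classes)
        if counts[c] > 0
    }


# Three samples of class 0 and one of class 1: the rare class gets the larger weight.
weights = balanced_class_weights(np.array([0, 0, 0, 1]), num_classes=2)
print(weights)
```

A rare class ends up with a weight above 1 and a frequent class with a weight below 1, so errors on rare classes count for more in the loss.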

Using the Model

(1) Segmenting the image: selective search splits the different parts of the image into separate candidate patches

def selective_search_bbox(image):
    logging.debug('image {}'.format(image))
    # load image
    img = skimage.io.imread(image)
    # img = Image.open(image)
    # NumPy image arrays are ordered (rows, columns, channels), so height comes first
    height, width, channels = img.shape
    logging.debug('img {}'.format(img.shape))
    logging.debug('img {}'.format(type(img)))
    region_pixels_threshold = (width * height) / 100
    logging.debug('region_pixels_threshold {}'.format(region_pixels_threshold))

    # perform selective search
    img_lbl, regions = selectivesearch.selective_search(img, scale=500, sigma=0.9, min_size=10)
    # img_lbl, regions = selectivesearch.selective_search(img)
    # logging.debug('regions {}'.format(regions))
    logging.debug('regions {}'.format(len(regions)))

    candidates = set()
    for r in regions:
        # distorted rects
        x, y, w, h = r['rect']
        # excluding same rectangle (with different segments)
        if r['rect'] in candidates:
            continue
        # excluding regions smaller than 1% of the image area
        if r['size'] < region_pixels_threshold:
            logging.debug('Discarding - region_pixels_threshold - {} < {} - x:{} y:{} w:{} h:{}'.format(r['size'], region_pixels_threshold, x, y, w, h))
            continue
        # # Orig
        # if w / h > 1.2 or h / w > 1.2:
        #     continue
        # excluding very elongated regions
        if h != 0 and w / h > 6:
            logging.debug('Discarding w/h {} - x:{} y:{} w:{} h:{}'.format(w / h, x, y, w, h))
            continue
        if w != 0 and h / w > 6:
            logging.debug('Discarding h/w {} - x:{} y:{} w:{} h:{}'.format(h / w, x, y, w, h))
            continue
        candidates.add(r['rect'])
    # The source listing ends here; the return is added because get_bbox() uses the result
    return candidates
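The filtering rules in selective_search_bbox() can be tried out without running selective search at all. The sketch below applies the same three rules (duplicates, minimum area, aspect ratio) to hand-written regions that use the same 'rect' and 'size' keys. The function name filter_regions, its parameters, and the sample regions are ours, chosen only for illustration.

```python
def filter_regions(regions, image_area, min_area_fraction=0.01, max_aspect_ratio=6):
    """Keep unique (x, y, w, h) rects that are large enough and not too elongated."""
    min_pixels = image_area * min_area_fraction
    candidates = set()
    for region in regions:
        x, y, w, h = region['rect']
        if region['rect'] in candidates:  # same rectangle seen before
            continue
        if region['size'] < min_pixels:  # too small relative to the image
            continue
        if h != 0 and w / h > max_aspect_ratio:  # too wide
            continue
        if w != 0 and h / w > max_aspect_ratio:  # too tall
            continue
        candidates.add(region['rect'])
    return candidates


# Hand-written regions for a 100 x 100 image (minimum region size: 100 pixels).
sample_regions = [
    {'rect': (0, 0, 50, 50), 'size': 2000},   # kept
    {'rect': (0, 0, 50, 50), 'size': 1800},   # duplicate rectangle
    {'rect': (5, 5, 4, 4), 'size': 16},       # too small
    {'rect': (0, 0, 90, 10), 'size': 900},    # too wide (w/h = 9)
    {'rect': (10, 0, 10, 80), 'size': 800},   # too tall (h/w = 8)
    {'rect': (20, 20, 30, 60), 'size': 1500}, # kept
]
kept = filter_regions(sample_regions, image_area=100 * 100)
print(sorted(kept))
```

Only the two reasonably sized, reasonably proportioned rectangles survive, which is the behavior the debug messages in the original function report region by region.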

(2) Prediction: this includes initialization, reading the images, loading the model, and visualizing the results

def init():
    global batch_size
    batch_size = batch_size_predict
    logging.debug('batch_size {}'.format(batch_size))

    global input_shape
    input_shape = (img_width, img_height, img_channel)
    logging.debug('input_shape {}'.format(input_shape))

    global class_names
    # TODO: Remove hardcoding if dataset available
    class_names = ['Anorak', 'Bomber', 'Button-Down', 'Capris', 'Chinos', 'Coat', 'Flannel', 'Hoodie', 'Jeans', 'Jeggings',
                   'Jersey', 'Kaftan', 'Parka', 'Peacoat', 'Poncho', 'Robe', 'Sweatshorts', 'Trunks', 'Turtleneck']
    # class_names = get_subdir_list(dataset_train_path)
    logging.debug('class_names {}'.format(class_names))

def get_images():
    images_path_name = sorted(glob.glob(prediction_dataset_path + '/*.jpg'))
    # logging.debug('images_path_name {}'.format(images_path_name))
    return images_path_name

def get_bbox(images_path_name):
    # TODO: Currently for 1 image only
    for index, image in enumerate(images_path_name):
        bboxes = selective_search_bbox(image)
        logging.debug('bboxes {}'.format(bboxes))
        return bboxes

# NOTE: the def line of this function is missing from the source; the name and signature below are assumed.
def predict(images_names):
    # model = create_model_predict((input_shape), optimizer, learn_rate, decay, momentum, activation, dropout_rate)
    model = create_model(False, True, input_shape, len(class_names), optimizer, learn_rate, decay, momentum, activation, dropout_rate)

    images_list = []
    images_name_list = []
    prediction_class = []
    prediction_iou = []
    prediction_class_prob = []
    prediction_class_name = []

    ## Folder
    prediction_dataset_path = 'dataset_prediction/images/'
    # images_path_name = sorted(glob.glob(prediction_dataset_path + '/*.jpg'))
    # for image in images_path_name:
    for index, image in enumerate(images_names):
        logging.debug('\n\n++++++++++++++++++++++++++++++++++++++++')
        image_path_name = prediction_dataset_path + image
        logging.debug('image_path_name {}'.format(image_path_name))
        img = Image.open(image_path_name)
        logging.debug('img {}'.format(img))
        logging.debug('img len {}'.format((img.size)))
        # img.save('output/a' + str(index) + '.jpg')
        img = preprocess_image(img)
        img = np.expand_dims(img, 0)

        # The model has two outputs: class probabilities and an IoU score
        prediction = model.predict(img, batch_size, verbose=1)
        # logging.debug('prediction {}'.format(prediction))
        prediction_class_ = prediction[0][0]
        # logging.debug('prediction_class_ {}'.format(prediction_class_))
        prediction_class.append(prediction_class_)
        prediction_iou_ = prediction[1][0][0]
        logging.debug('prediction_iou_ {}'.format(prediction_iou_))
        prediction_iou.append(prediction_iou_)
        prediction_class_index = np.argmax(prediction[0])
        logging.debug('prediction_class_index {}'.format(prediction_class_index))
        prediction_class_prob_ = prediction[0][0][prediction_class_index]
        logging.debug('prediction_class_prob_ {}'.format(prediction_class_prob_))
        prediction_class_prob.append(prediction_class_prob_)
        prediction_class_name_ = class_names[prediction_class_index]
        logging.debug('prediction_class_name_ {}'.format(prediction_class_name_))
        prediction_class_name.append(prediction_class_name_)
        images_list.append(img)
        images_name_list.append(image_path_name)

    # logging.debug('prediction_class {}'.format(prediction_class))
    logging.debug('prediction_iou {}'.format(prediction_iou))
    logging.debug('prediction_class_prob {}'.format(prediction_class_prob))
    logging.debug('prediction_class_name {}'.format(prediction_class_name))
    # logging.debug('images_name_list {}'.format(images_name_list))

    # Each crop file name encodes its bounding box as '<name>-<x>_<y>_<w>_<h>.jpg'
    bboxes = []
    for image_path_name in images_name_list:
        bbox_ = image_path_name.split('/')[-1].split('.jpg')[0].split('-')[1]
        x = int(bbox_.split('_')[0])
        y = int(bbox_.split('_')[1])
        w = int(bbox_.split('_')[2])
        h = int(bbox_.split('_')[3])
        bbox = (x, y, w, h)
        bboxes.append(bbox)
    bboxes = set(bboxes)
    logging.debug('bboxes {}'.format(bboxes))

    # orig_image_path_name = ['dataset_prediction/images/img_00000061.jpg']
    # orig_image_path_name = ['dataset_prediction/images2/shahida-parides-floral-v-neckline-long-kaftan-dress.jpg']
    orig_image_path_name = sorted(glob.glob('dataset_prediction/images' + '/*.jpg'))
    logging.debug('orig_image_path_name {}'.format(orig_image_path_name))
    # display_bbox is a visualization helper defined elsewhere in the project
    display_bbox(orig_image_path_name, bboxes, prediction_class_name, prediction_class_prob, prediction_iou, images_name_list)

    logging.debug('images_list {}'.format(len(images_list)))
    # Each entry already has a leading batch axis from np.expand_dims, so concatenate
    # along that axis; np.array(images_list) would add a fifth dimension
    images_list_arr = np.concatenate(images_list, axis=0)
    logging.debug('images_list_arr type {}'.format(type(images_list_arr)))
    prediction = model.predict(images_list_arr, batch_size, verbose=1)
    # prediction = model.predict(predict_data, batch_size, verbose=1)
    # logging.debug('\n\nprediction\n{}'.format(prediction))
    logging.debug('prediction shape {} {}'.format(len(prediction), len(prediction[0])))

    print('')
    for index, preds in enumerate(prediction):
        for index2, pred in enumerate(preds):
            # The source indexes images_name_list2, which is never filled; images_name_list is used instead
            print('images_name_list index2 : {:110}    '.format(images_name_list[index2]), end='')
            for p in pred:
                print('{:8f}'.format(float(p)), end='')
            print('')
        print('')
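The prediction code repeats two small pieces of logic: turning the two model outputs into a class name, a class probability and an IoU score, and reading a bounding box back out of a crop's file name. The sketch below isolates both as plain functions and runs them on made-up arrays. The function names, the three-class list, and the example file name are hypothetical; only the '<name>-<x>_<y>_<w>_<h>.jpg' pattern is taken from the parsing code above.

```python
import numpy as np


def decode_predictions(class_probs, iou_scores, class_names):
    """Return (class name, class probability, IoU score) for every image in the batch."""
    results = []
    for probs, iou in zip(class_probs, iou_scores):
        index = int(np.argmax(probs))
        results.append((class_names[index], float(probs[index]), float(iou[0])))
    return results


def parse_bbox(image_path_name):
    """Read (x, y, w, h) from a crop file named '<name>-<x>_<y>_<w>_<h>.jpg'."""
    stem = image_path_name.split('/')[-1].split('.jpg')[0]
    x, y, w, h = (int(value) for value in stem.split('-')[1].split('_'))
    return (x, y, w, h)


# Made-up outputs for two crops: shape (2, 3) class probabilities and shape (2, 1) IoU scores.
class_probs = np.array([[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]])
iou_scores = np.array([[0.8], [0.4]])
decoded = decode_predictions(class_probs, iou_scores, ['Coat', 'Hoodie', 'Jeans'])
bbox = parse_bbox('dataset_prediction/images/img_00000061-10_20_30_40.jpg')
print(decoded, bbox)
```

Like the original parsing code, parse_bbox assumes the base name contains no other hyphen before the box coordinates; a name such as the kaftan example in the comments above would need a different split.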

The result is shown in the figure below:


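This content is reposted from the web. If any images or other material in it infringe your rights, please contact the editor to have them removed.

Notice: the copyright of this article belongs to the original author. It is reposted solely to share information. If the author attribution is incorrect, please contact us promptly so we can correct or remove it. Thank you.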
