Tensorflow高性能数据优化增强工具Pipeline使用详解

ronghuaiyang 2022-11-02 我要评论

安装方法

给大家介绍一个非常好用的TensorFlow数据pipeline工具。

高性能的Tensorflow Data Pipeline，使用SOTA的增强和底层优化。

pip install tensorflow-addons==0.11.2
pip install tensorflow==2.2.0
pip install sklearn

功能

High Performance tf.data pipline
Core tensorflow support for high performance
Classification data support
Bbox data support
Keypoints data support
Segmentation data support
GridMask in core tf2.x
Mosiac Augmentation in core tf2.x
CutOut in core tf2.x
Flexible and easy configuration
Gin-config support

高级用户部分

用例1，为训练创建数据Pipeline

from pipe import Funnel                                                         
from bunch import Bunch                                                         
"""                                                                             
Create a Funnel for the Pipeline!                                               
"""                                                                             
# Config for Funnel
config = {                                                                      
    "batch_size": 2,                                                            
    "image_size": [512,512],                                                    
    "transformations": {                                                        
        "flip_left_right": None,                                                
        "gridmask": None,                                                       
        "random_rotate":None,                                                   
    },                                                                          
    "categorical_encoding":"labelencoder"                                       
}                                                                               
config = Bunch(config)                                                          
pipeline = Funnel(data_path="testdata", config=config, datatype="categorical")  
pipeline = pipeline.dataset(type="train")                                       
# Pipline ready to use, iter over it to use.
# Custom loop example.
for data in pipeline:
    image_batch , label_batch = data[0], data[1]
    # you can use _loss = loss(label_batch,model.predict(image_batch))
    # calculate gradients on loss and optimize the model.
    print(image_batch,label_batch)

用例2，为验证创建数据Pipeline

from pipe import Funnel                                                         
from bunch import Bunch                                                         
"""                                                                             
Create a Funnel for the Pipeline!                                               
"""                                                                             
# Config for Funnel
config = {                                                                      
    "batch_size": 1,                                                            
    "image_size": [512,512],                                                    
    "transformations": {                                                                                                       
    },                                                                          
    "categorical_encoding":"labelencoder"                                       
}                                                                               
config = Bunch(config)                                                          
pipeline = Funnel(data_path="testdata", config=config, datatype="categorical", training=False)  
pipeline = pipeline.dataset(type="val")                                       
# use pipeline to validate your data on model.
loss = []
for data in pipeline:
    image_batch , actual_label_batch = data[0], data[1]
    # pred_label_batch = model.predict(image_batch)
    # loss.append(calc_loss(actual_label_batch,pred_label_batch))
    print(image_batch,label_batch)

初学者部分

Keras 兼容性

使用keras model.fit来构建非常简单的pipeline。

import tensorflow as tf
from pipe import Funnel
"""
Create a Funnel for the Pipeline!
"""
config = {
    "batch_size": 2,
    "image_size": [100, 100],
    "transformations": {
        "flip_left_right": None,
        "gridmask": None,
        "random_rotate": None,
    },
    "categorical_encoding": "labelencoder",
}
pipeline = Funnel(data_path="testdata", config=config, datatype="categorical")
pipeline = pipeline.dataset(type="train")
# Create Keras model
model = tf.keras.applications.VGG16(
    include_top=True, weights=None,input_shape=(100,100,3),
    pooling=None, classes=2, classifier_activation='sigmoid'
)
# compile
model.compile(loss='mse', optimizer='adam')
# pass pipeline as iterable
model.fit(pipeline , batch_size=2,steps_per_epoch=5,verbose=1)

配置

image_size - pipeline的图像尺寸。
batch_size - pipeline的Batch size。
transformations - 应用数据增强字典中的对应关键字。
categorical_encoding - 对类别数据进行编码 - ('labelencoder' , 'onehotencoder').

增强：

GridMask

在输入图像上创建gridmask，并在范围内定义旋转。

参数：

ratio - 空间上的网格比例

fill - 填充值fill value

rotate - 旋转的角度范围

MixUp

使用给定的alpha值，将两个随机采样的图像和标签进行混合。

参数：

alpha - 在混合时使用的值。

RandomErase

在给定的图像上的随机位置擦除一个随机的矩形区域。

参数：

prob - 在图像上进行随机的概率。

CutMix

在给定图像上对另一个随机采样的图像进行随机的缩放，再以完全覆盖的方式贴到这个给定图像上。

params：

prob - 在图像上进行CutMix的概率。

Mosaic

把4张输入图像组成一张马赛克图像。

参数：

prob - 进行Mosaic的概率。

Tensorflow高性能数据优化增强工具Pipeline使用详解

安装方法

功能

高级用户部分

用例1，为训练创建数据Pipeline

用例2，为验证创建数据Pipeline

初学者部分

Keras 兼容性

配置

增强：

GridMask

MixUp

RandomErase

CutMix

Mosaic

CutMix , CutOut, MixUp

Mosaic

Grid Mask

相关文章

猜您喜欢

今日热门