给大家介绍一个非常好用的TensorFlow数据pipeline工具。
高性能的Tensorflow Data Pipeline,使用SOTA的增强和底层优化。
pip install tensorflow-addons==0.11.2 pip install tensorflow==2.2.0 pip install sklearn
from pipe import Funnel from bunch import Bunch """ Create a Funnel for the Pipeline! """ # Config for Funnel config = { "batch_size": 2, "image_size": [512,512], "transformations": { "flip_left_right": None, "gridmask": None, "random_rotate":None, }, "categorical_encoding":"labelencoder" } config = Bunch(config) pipeline = Funnel(data_path="testdata", config=config, datatype="categorical") pipeline = pipeline.dataset(type="train") # Pipline ready to use, iter over it to use. # Custom loop example. for data in pipeline: image_batch , label_batch = data[0], data[1] # you can use _loss = loss(label_batch,model.predict(image_batch)) # calculate gradients on loss and optimize the model. print(image_batch,label_batch)
from pipe import Funnel from bunch import Bunch """ Create a Funnel for the Pipeline! """ # Config for Funnel config = { "batch_size": 1, "image_size": [512,512], "transformations": { }, "categorical_encoding":"labelencoder" } config = Bunch(config) pipeline = Funnel(data_path="testdata", config=config, datatype="categorical", training=False) pipeline = pipeline.dataset(type="val") # use pipeline to validate your data on model. loss = [] for data in pipeline: image_batch , actual_label_batch = data[0], data[1] # pred_label_batch = model.predict(image_batch) # loss.append(calc_loss(actual_label_batch,pred_label_batch)) print(image_batch,label_batch)
使用keras model.fit来构建非常简单的pipeline。
import tensorflow as tf from pipe import Funnel """ Create a Funnel for the Pipeline! """ config = { "batch_size": 2, "image_size": [100, 100], "transformations": { "flip_left_right": None, "gridmask": None, "random_rotate": None, }, "categorical_encoding": "labelencoder", } pipeline = Funnel(data_path="testdata", config=config, datatype="categorical") pipeline = pipeline.dataset(type="train") # Create Keras model model = tf.keras.applications.VGG16( include_top=True, weights=None,input_shape=(100,100,3), pooling=None, classes=2, classifier_activation='sigmoid' ) # compile model.compile(loss='mse', optimizer='adam') # pass pipeline as iterable model.fit(pipeline , batch_size=2,steps_per_epoch=5,verbose=1)
在输入图像上创建gridmask,并在范围内定义旋转。
参数:
ratio - 空间上的网格比例
fill - 填充值fill value
rotate - 旋转的角度范围
使用给定的alpha值,将两个随机采样的图像和标签进行混合。
参数:
alpha - 在混合时使用的值。
在给定的图像上的随机位置擦除一个随机的矩形区域。
参数:
prob - 在图像上进行随机的概率。
在给定图像上对另一个随机采样的图像进行随机的缩放,再以完全覆盖的方式贴到这个给定图像上。
params:
prob - 在图像上进行CutMix的概率。
把4张输入图像组成一张马赛克图像。
参数:
prob - 进行Mosaic的概率。