The Fashion-MNIST dataset contains 28×28 grayscale images of clothes. Our goal is to build a neural network with PyTorch and then train it to predict the type of clothing in each image. The maximum accuracy reached so far is 84%. The first Python version is not refactored; the third one is.

This is the refactored code, in a gist file.

I currently have an accuracy of 81%. How could I try to improve the trained model?

Here are some suggestions that might help you improve the accuracy of your model:

  • Use a larger dataset. The Fashion-MNIST dataset is relatively small, with only 70,000 images (60,000 for training and 10,000 for testing). More data gives the model more to learn from, which could lead to better accuracy. Note that going over the Fashion-MNIST training set a second time is simply another epoch, and that the MNIST dataset is the same size and contains handwritten digits rather than clothes, so it is not a larger substitute. In practice, data augmentation is the usual way to get more varied training examples.
  • Use a more complex network architecture. The current network architecture is relatively simple. A more complex architecture, such as a deeper neural network, could learn more complex patterns in the data, which could lead to better accuracy.
  • Use a different optimizer. The SGD optimizer is a good optimizer to start with, but it can be slow to converge on complex problems. A different optimizer, such as Adam or RMSProp, could converge faster and improve accuracy.
  • Use a different loss function. The negative log-likelihood loss (NLLLoss) is a good choice for classification problems, but may not be the best choice for every problem. You can try the cross-entropy loss (CrossEntropyLoss), keeping in mind that in PyTorch it is equivalent to a log-softmax followed by NLLLoss, so on its own this switch mostly simplifies the model rather than changing the accuracy.
  • Monitor the training process. Look at the training and validation loss curves to see if the model is converging. If the model is not converging, you may need to tune the model’s hyperparameters, such as the learning rate, the number of epochs or the batch size.
  • Use cross-validation. Cross-validation splits the training data into several folds and evaluates the model on each held-out fold in turn, so the accuracy is always measured on data the model did not see during training. This will help you check that the model is not overfitting the training set.
  • Use transfer learning. Transfer learning is a technique that allows you to use a model trained on one data set to improve the accuracy of a model trained on another data set. You can use a model trained on the MNIST dataset to improve the accuracy of your Fashion-MNIST model.
  • Hyperparameter Optimization: Experiment with different hyperparameter values such as learning rate, number of epochs, and batch size to find settings that improve model performance.
  • Using GPU: If you have access to a GPU, run PyTorch on it to speed up training. You can move the model and the data to the GPU using .to('cuda').
  • Reduce redundant code: Remove redundant code or unused functions, such as visualize_a_Batch_of_Training_Data if you don’t need to visualize data.
  • Using pre-trained models: Consider starting from a model pre-trained on another image dataset and fine-tuning it on Fashion-MNIST (the transfer learning mentioned above) to improve efficiency and performance.
  • Data parallelization: If you have a multi-core CPU, you can use DataLoader with num_workers to load data in parallel and speed up training.
  • Optimizing graphs and figures: Review the code that generates figures with matplotlib and ensure that there are no unnecessary calls to plotting functions. You can also save figures to a file in a compact format, such as PNG, instead of displaying them in an interactive window.
  • Implementation of Data Augmentation: You can apply data augmentations to the training set to improve the generalization of the model.
  • Save and load models to a specific location: Instead of saving and loading the model to the current working directory, you can specify a specific location to avoid problems.
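
Three of the points above (using the GPU, loading data in parallel, and saving to an explicit location) can be sketched together as follows. This is only a minimal sketch and not the code from the gist: the `TinyClassifier` model, the random tensors standing in for Fashion-MNIST, and the checkpoint directory name are assumptions made for illustration.

```python
import os
import tempfile

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class TinyClassifier(nn.Module):
    """Hypothetical stand-in for the Classifier in the gist."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 64),
            nn.ReLU(),
            nn.Linear(64, 10),
        )

    def forward(self, x):
        return self.net(x)


# Random tensors shaped like Fashion-MNIST images (assumption: real code
# would load the actual dataset instead).
images = torch.rand(256, 1, 28, 28)
labels = torch.randint(0, 10, (256,))
dataset = TensorDataset(images, labels)

# num_workers=0 loads data in the main process. On a multi-core machine,
# try 2 or 4; on Windows/macOS that requires an `if __name__ == "__main__":` guard.
NUM_WORKERS = 0
loader = DataLoader(dataset, batch_size=64, shuffle=True, num_workers=NUM_WORKERS)

model = TinyClassifier().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# One pass over the data; every batch is moved to the same device as the model.
for batch_images, batch_labels in loader:
    batch_images = batch_images.to(device)
    batch_labels = batch_labels.to(device)
    optimizer.zero_grad()
    loss = criterion(model(batch_images), batch_labels)
    loss.backward()
    optimizer.step()

# Save to an explicit directory instead of the current working directory.
checkpoint_dir = os.path.join(tempfile.gettempdir(), "fashion_mnist_checkpoints")
os.makedirs(checkpoint_dir, exist_ok=True)
checkpoint_path = os.path.join(checkpoint_dir, "classifier.pt")
torch.save(model.state_dict(), checkpoint_path)

# Load the weights back; map_location keeps this working on CPU-only machines.
restored = TinyClassifier()
restored.load_state_dict(torch.load(checkpoint_path, map_location=device))
restored.to(device)
```

Saving the `state_dict` rather than the whole model object keeps the checkpoint independent of where the class definition lives, which tends to make it easier to reload after a refactor.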

Now, regarding setting the hyperparameters in the current state of the code as it is in the gist, some suggestions:

Learning Rate (lr): The learning rate is a critical hyperparameter. You can experiment with different learning rates to find the one that best suits your model. Typical values for SGD are usually in the range of 0.001 to 0.1. Try values like 0.1, 0.01, and 0.001 to see which works best for your model.
Number of Epochs (epochs): The number of epochs determines how many times the entire training set is traversed. Increasing this value may improve performance, but also increases training time. Start with the current value of 40 epochs and consider increasing it if you don’t see adequate convergence.
Batch Size (batch_size): The batch size influences the training process. Common values are 32, 64, and 128. You can experiment with different batch sizes to see which provides a better balance between training speed and convergence quality.
Network Architecture (model): You can adjust the architecture of your model (the Classifier file) by experimenting with the number of layers and neurons in each layer. Adding additional layers or adjusting the network depth could improve performance.
Loss Function (criterion): In addition to the current loss function (NLLLoss), you can try other loss functions, such as CrossEntropyLoss, to see if they have a positive impact on performance. Note that CrossEntropyLoss expects raw scores (logits), so the final log-softmax of the model has to be removed if you switch.
Regularization (dropout, weight_decay): Adding dropout layers or setting the weight_decay parameter of the optimizer can help prevent overfitting.
Data Augmentation: Applying data augmentation transformations, such as rotation, translation, cropping, etc., to the training set can improve the model’s ability to generalize.
Using Different Optimizers: In addition to SGD, you can try other optimizers like Adam or RMSprop to see if they offer faster convergence and better performance.
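Using L1 or L2 Regularization: Experiment with adding an L1 or L2 penalty on the weights to reduce overfitting. In PyTorch, the L2 penalty is what weight_decay applies in SGD, while an L1 penalty has to be added to the loss by hand.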
Network Architecture Changes: If possible, consider experimenting with more advanced network architectures, such as convolutional neural networks (CNN), which often work well in computer vision problems.
Data Preprocessing: Ensure that the data is preprocessed in the best possible way, for example, by scaling or normalizing appropriately.
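
Several of the settings above (a convolutional architecture, dropout, weight_decay, Adam, and the choice between NLLLoss and CrossEntropyLoss) can be sketched in a few lines. This is an illustrative sketch under assumptions, not the Classifier from the gist: `SmallCNN`, its layer sizes, the dropout rate and the optimizer values are all hypothetical, and random tensors are used in place of real images.

```python
import torch
from torch import nn
import torch.nn.functional as F


class SmallCNN(nn.Module):
    """Hypothetical CNN for 1x28x28 inputs and 10 classes; returns raw logits."""

    def __init__(self, dropout=0.25):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(32 * 7 * 7, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))  # 1x28x28 -> 16x14x14
        x = self.pool(F.relu(self.conv2(x)))  # 16x14x14 -> 32x7x7
        x = self.dropout(torch.flatten(x, 1))
        return self.fc(x)


torch.manual_seed(0)
model = SmallCNN()
images = torch.rand(8, 1, 28, 28)       # stand-in for a batch of images
labels = torch.randint(0, 10, (8,))     # stand-in for the class labels

# Compare the two losses on the same logits (eval mode disables dropout).
model.eval()
logits = model(images)
ce = nn.CrossEntropyLoss()(logits, labels)
nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), labels)
print(f"CrossEntropyLoss={ce.item():.6f}  NLLLoss(log_softmax)={nll.item():.6f}")

# Adam with weight_decay, which adds an L2 penalty on the weights.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

# One training step with dropout active.
model.train()
optimizer.zero_grad()
loss = nn.CrossEntropyLoss()(model(images), labels)
loss.backward()
optimizer.step()
```

The two printed loss values agree up to floating-point error, which is why swapping NLLLoss for CrossEntropyLoss is mainly a matter of where the log-softmax is applied.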

It is important to note that results may vary depending on the dataset and the complexity of the problem. Therefore, it is advisable to perform systematic experiments with different combinations of hyperparameters and compare them using metrics such as accuracy and loss on a held-out validation set, keeping the test set for the final evaluation of the hyperparameters you choose for your specific task.
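
One way to run such systematic experiments is a small grid search. The sketch below is only an illustration under assumptions: the `make_model` and `train_and_evaluate` helpers are hypothetical, the candidate values are examples, and random tensors replace the real dataset, so the accuracies it prints are meaningless beyond showing the mechanics.

```python
import itertools

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split


def make_model():
    """Hypothetical small fully connected classifier."""
    return nn.Sequential(
        nn.Flatten(), nn.Linear(28 * 28, 32), nn.ReLU(), nn.Linear(32, 10)
    )


def train_and_evaluate(train_set, val_set, lr, batch_size, epochs=2):
    """Train with the given hyperparameters and return validation accuracy."""
    torch.manual_seed(0)  # same initial weights for every configuration
    model = make_model()
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)

    model.train()
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            criterion(model(x), y).backward()
            optimizer.step()

    model.eval()
    correct = 0
    with torch.no_grad():
        for x, y in DataLoader(val_set, batch_size=256):
            correct += (model(x).argmax(dim=1) == y).sum().item()
    return correct / len(val_set)


# Random stand-in data, split into training and validation subsets.
torch.manual_seed(0)
data = TensorDataset(torch.rand(512, 1, 28, 28), torch.randint(0, 10, (512,)))
train_set, val_set = random_split(data, [384, 128])

# Try every combination of learning rate and batch size.
results = {}
for lr, batch_size in itertools.product([0.1, 0.01, 0.001], [32, 64, 128]):
    results[(lr, batch_size)] = train_and_evaluate(train_set, val_set, lr, batch_size)

best = max(results, key=results.get)
print(f"best lr={best[0]}, batch_size={best[1]}, val accuracy={results[best]:.3f}")
```

With the real dataset, the same loop would compare configurations on a validation split carved out of the training data, and the test set would be used once at the end.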


I hope these suggestions help you improve the accuracy of your model. I will try to expand and improve this text with more suggestions.
