Bundle Network
This methodology replaces the solution of the (Quadratic) Master Problem in the Bundle Method, which computes the search direction as a convex combination of the subgradients at visited points, with an ML model based on recurrence and attention.
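For intuition, the master-problem step that the learned model replaces can be sketched as follows: given the subgradients g_i collected at visited points, a classical bundle method picks simplex weights and moves along their negative combination (a generic bundle-method formulation, not code from this repository):

```latex
d = -\sum_{i \in B} \alpha_i \, g_i,
\qquad
\alpha \in \Delta_{|B|} = \Big\{ \alpha \ge 0 \;:\; \sum_{i \in B} \alpha_i = 1 \Big\}
```

Here the network predicts the coefficients $\alpha$ directly (via softmax or sparsemax, see the --use_softmax flag below) instead of solving the quadratic program.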
We assume that you have already downloaded or created your data and saved it in ./data/<your_folder>/. This folder should contain 2000 instances, with 1000 used for training, 500 for validation, and 500 for testing.
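As a sanity check, the expected 1000/500/500 split can be verified by counting files. This is only a sketch: the folder name, file names, and the .txt extension below are placeholders, not the repository's actual instance format.

```shell
# Sketch: build a stand-in dataset folder and check that the
# 1000 train + 500 validation + 500 test instances add up to 2000.
# File names and the ".txt" extension are placeholders.
DATA_DIR=./data/example_folder
mkdir -p "$DATA_DIR"
for i in $(seq 1 2000); do touch "$DATA_DIR/instance_$i.txt"; done
TOTAL=$(ls "$DATA_DIR" | wc -l)
echo "total instances: $TOTAL"
```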
Training
Here we demonstrate how to train such a model, with or without batching.
Without Batching
julia --project=. ./runs/runTraining.jl --data ./data/<your_folder>/ --lr 1.0e-4 --decay 0.9 --cn 5 --mti 1000 --mvi 500 --seed 1 --maxIt 10 --maxEP 10 --soft_updates true --h_representation 32 --use_softmax false --gamma 0.999 --lambda 0.0 --delta 0.0 --use_graph false --maxItBack -1 --maxItVal 20 --batch_size 2 --always_batch true --h_act softplus --sampling_gamma false
Using Batching
julia --project=. ./runs/runTraining.jl --data ./data/<your_folder>/ --lr 1.0e-4 --decay 0.9 --cn 5 --mti 100 --mvi 500 --seed 1 --maxIt 10 --maxEP 10 --soft_updates true --h_representation 32 --use_softmax false --gamma 0.999 --lambda 0.0 --delta 0.0 --use_graph false --maxItBack -1 --maxItVal 20 --batch_size 1 --always_batch true --h_act softplus --sampling_gamma false
Parameter Explanation
- --data ./data/<your_folder>/: Path to the folder containing the dataset instances.
- --lr 1.0e-4: Learning rate for updating the model parameters.
- --decay 0.9: Decay factor for the optimizer.
- --cn 5: Coefficient to prevent the gradient norm from becoming too large.
- --mti 1000: Number of training instances.
- --mvi 500: Number of validation instances.
- --seed 1: Random seed.
- --maxIt 10: Number of iterations (unrolls) of the Bundle method per call.
- --maxEP 10: Number of training epochs.
- --soft_updates true: If false, uses the standard bundle strategy; if true, uses a smoothed update based on the softplus function.
- --h_representation 32: Size of the hidden representation for each visited point.
- --use_softmax false: If true, uses softmax to predict the convex combination coefficients; otherwise, uses sparsemax.
- --gamma 0.999: Coefficient in the telescopic sum, larger for the last point, smaller for earlier points.
- --lambda 0.0: Final contribution weight: lambda * final_point + (1 - lambda) * stabilization_point.
- --delta 0.0: Additional regularization parameter.
- --use_graph false: If true, uses a bipartite graph representation instead of recurrence (still in development; not recommended).
- --maxItBack -1: (Description not provided.)
- --maxItVal 20: Number of iterations of the Bundle method for validation instances.
- --batch_size 1: Batch size.
- --sampling_gamma false: If true, uses sampling in the hidden representation; otherwise, no sampling.
- --always_batch true: If batch_size is 1 and this is true, batching is enforced anyway.
- --h_act softplus: Activation function used in hidden layers.
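Flags such as --seed make it easy to script repeated experiments. A minimal sketch follows; "echo" is used so the loop runs without a Julia installation (remove it to actually launch training), and only a subset of the flags above is shown. The folder name is a placeholder.

```shell
# Sketch: sweep the random seed across several training runs.
# "echo" prints each command instead of executing it; drop it to launch for real.
for SEED in 1 2 3; do
  echo julia --project=. ./runs/runTraining.jl \
    --data ./data/example_folder/ --lr 1.0e-4 --seed "$SEED" \
    --maxIt 10 --maxEP 10 --batch_size 1
done
```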
Testing
To evaluate a pretrained model's performance:
julia --project=. ./runs/runTest.jl --folder ./data/<your_folder>/ --model_folder BatchVersion_bs_1_true_<your_folder>_0.0001_0.9_5_1000_500_1_10_100_true_32_false_softplus_false_0.999_0.0_0.0_sparsemax_false_false_1.0
Parameter Explanation
- --data ./data/<your_folder>/: Path to the dataset folder.
- --dataset ./res/BatchVersion_bs_1_true_<your_folder>_0.0001_0.9_5_1000_500_1_10_100_true_32_false_softplus_false_0.999_0.0_0.0_sparsemax_false_false_1.0/: Path to the model folder, usually obtained from training.
- --model: (Optional) Path to a specific model folder; if omitted, defaults to the associated test instances folder.
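Since the model folder name encodes the training hyperparameters and is long to type, one convenient pattern is to pick the most recently written folder under ./res/. This is a sketch: it assumes training writes its output under ./res/ (as the path above suggests), and the two mkdir lines only create stand-in folders so the snippet is self-contained.

```shell
# Sketch: select the most recently modified model folder under ./res/.
# The mkdir line creates stand-in folders; with real training output, remove it.
mkdir -p ./res/run_a ./res/run_b
MODEL_DIR=$(ls -td ./res/*/ | head -n 1)
echo "latest model folder: $MODEL_DIR"
```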
Further References
- Hyper-Parameter Learning: Learning hyperparameters in the (aggregated) Bundle method.
- Baselines: Grid-search based baselines, including classic (aggregated) Bundle methods with various strategies, Adam, and Gradient Descent.