Synthia
Generic and flexible data structure generator
GenerateLog.java
/*
    Synthia, a data structure generator
    Copyright (C) 2019-2020 Laboratoire d'informatique formelle
    Université du Québec à Chicoutimi, Canada

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU Lesser General Public License as published
    by the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
    GNU Lesser General Public License for more details.

    You should have received a copy of the GNU Lesser General Public License
    along with this program. If not, see <http://www.gnu.org/licenses/>.
 */
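package examples.apache;

import ca.uqac.lif.synthia.Picker;
import ca.uqac.lif.synthia.random.RandomBoolean;
import ca.uqac.lif.synthia.random.RandomFloat;
import ca.uqac.lif.synthia.random.RandomInteger;
import ca.uqac.lif.synthia.sequence.Knit;
import ca.uqac.lif.synthia.sequence.MarkovChain;
import ca.uqac.lif.synthia.string.AsString;
import ca.uqac.lif.synthia.string.RandomString;
import ca.uqac.lif.synthia.string.StringPattern;
import ca.uqac.lif.synthia.tree.MarkovReader;
import ca.uqac.lif.synthia.tree.StringNodePicker;
import ca.uqac.lif.synthia.util.AsLong;
import ca.uqac.lif.synthia.util.Choice;
import ca.uqac.lif.synthia.util.Constant;
import ca.uqac.lif.synthia.util.Tick;
import examples.graphs.BarabasiAlbert;
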
/**
 * Main program that generates a simulated log file by interleaving multiple
 * visitor instances.
 *
 * <h4>Site map</h4>
 * The generation of the site map reuses the Barabási–Albert model from
 * {@linkplain BarabasiAlbert another example}. A scale-free graph with a
 * randomly selected number of nodes is first generated; the resulting graph
 * has a few nodes with high degree (corresponding to "main" pages) and a
 * majority of nodes with low degree. A {@link StringPattern} picker produces
 * a filename for each node by generating a random string and appending the
 * <tt>html</tt> extension to it. An example of such a map is shown
 * below:
 * <p>
 * <img src="./doc-files/apache/map.png" width="50%" alt="Example of site map" />
 * <p>
 * The site map is then turned into a {@link MarkovChain}, where each
 * undirected edge between vertices A and B stands for both a link from A to
 * B and a link from B to A. Each outgoing edge of a given vertex is given the
 * same probability.
 *
 * <h4>Visitors</h4>
 *
 * A visitor is a {@link LogLinePicker} that is fed with three pickers: one
 * for its IP address, one providing page names, and one providing a
 * timestamp.
 * <p>
 * The IP address of each visitor is pseudo-randomly generated using a
 * {@link StringPattern} picker, with different probabilities associated with
 * different regions. For the purpose of the simulation:
 * <ul>
 * <li>addresses of the form <tt>11.x.x.x</tt> are considered to be in the USA
 * and have a 1/2 probability of being generated</li>
 * <li>addresses of the form <tt>10.x.x.x</tt> are considered to be in Canada
 * and have a 1/3 probability of being generated</li>
 * <li>addresses in the range <tt>20.x.x.x</tt>-<tt>60.x.x.x</tt> are
 * considered to be in Europe and have a 1/6 probability of being generated</li>
 * </ul>
 * The picker for page names is an instance of the Markov chain defined
 * previously. Note that the {@link VisitorPicker} gives a distinct copy of
 * the chain to each visitor; hence, each visitor performs its own
 * independent random walk, but all visitors use the same site map.
 * <p>
 * Finally, the picker for the timestamp is an instance of {@link Tick}, which
 * increments a timestamp counter by a random amount every time it is called.
 * Note that the same <tt>Tick</tt> instance is shared by all visitors, which
 * ensures that the global timestamp advances on each page load, regardless of
 * which visitor requested the page.
 *
 * <h4>Generating the log</h4>
 * To simulate the actions of multiple visitors on the site, it then suffices
 * to pass the {@link VisitorPicker} to an instance of {@link Knit}, which
 * takes care of creating new visitors and interleaving the progression of
 * each of them. Each call to {@link Knit#pick() pick()} selects one visitor,
 * makes it take a transition, and outputs the log line resulting from the
 * corresponding simulated page request.
 * <p>
 * A typical run of the program looks like this:
 * <pre>
 * 11.205.232.45 - - [26-Oct-1985 01:21:00 EDT 0] "GET J0.html HTTP/2" 200 1000
 * 11.130.222.207 - - [26-Oct-1985 01:21:04 EDT 0] "GET Xv.html HTTP/2" 200 1000
 * 11.130.222.207 - - [26-Oct-1985 01:21:09 EDT 0] "GET 9y.html HTTP/2" 200 1000
 * 11.16.2.198 - - [26-Oct-1985 01:21:17 EDT 0] "GET Xv.html HTTP/2" 200 1000
 * 11.130.222.207 - - [26-Oct-1985 01:21:27 EDT 0] "GET t0.html HTTP/2" 200 1000
 * 45.86.208.44 - - [26-Oct-1985 01:21:29 EDT 0] "GET b7.html HTTP/2" 200 1000
 * 10.112.251.183 - - [26-Oct-1985 01:21:34 EDT 0] "GET b7.html HTTP/2" 200 1000
 * 11.130.222.207 - - [26-Oct-1985 01:21:40 EDT 0] "GET 9y.html HTTP/2" 200 1000
 * 11.16.2.198 - - [26-Oct-1985 01:21:41 EDT 0] "GET 9y.html HTTP/2" 200 1000
 * 45.86.208.44 - - [26-Oct-1985 01:21:45 EDT 0] "GET 9y.html HTTP/2" 200 1000
 * &hellip;
 * </pre>
 *
 * <h4>Exercises</h4>
 * <ol>
 * <li>Modify the scenario so that each page is assigned a randomly selected
 * size, instead of the constant "1000" that appears on each line. For a given
 * page, make sure that the same size is always shown.</li>
 * <li>Modify the Markov chain to add a sink state that is accessible from
 * every state and that causes a visitor to "leave" the site (i.e. no longer
 * produce any new page request).</li>
 * <li>Modify the scenario so that some pages do not actually exist, and
 * result in a return code of 404 instead of 200 (the next-to-last element of
 * each log line).</li>
 * </ol>
 * @ingroup Examples
 */
public class GenerateLog
{

    public static void main(String[] args)
    {
        int seed = 10;

        // Generate a site map, then turn it into a Markov chain
        Picker<Integer> size = new RandomInteger(15, 20).setSeed(seed);
        RandomBoolean coin = new RandomBoolean();
        coin.setSeed(0);
        // [line not captured in this listing: it opens the declaration of map_gen,
        //  apparently a BarabasiAlbert generator]
                new StringNodePicker(new StringPattern("{$0}.html", new RandomString(new RandomInteger(2,3)).setSeed(seed))),
                size, coin);
        // [line not captured in this listing: it opens the declaration of chain,
        //  apparently through an anonymous MarkovReader subclass]
            public Picker<String> getPicker(String s) {
                return new Constant<String>(s);
            }
        }.asMarkovChain(map_gen.pick(), RandomFloat.instance);

        // Generate a pattern of IP addresses
        StringPattern ip = new StringPattern("{$0}.{$1}.{$2}.{$3}",
                // [line not captured in this listing: it creates the picker for the
                //  first octet, apparently a Choice, on which add() is called below]
                .add("10", 1d/3).add("11", 1d/2).add(new AsString(new RandomInteger(20, 61)), 1d/6),
                new AsString(new RandomInteger(0, 256)),
                new AsString(new RandomInteger(0, 256)),
                new AsString(new RandomInteger(0, 256))
                );

        // Interleave multiple visitor instances
        // [line not captured in this listing: it opens the declaration of all,
        //  apparently a Knit wrapping a VisitorPicker]
                new AsLong(new Tick(499152060000L, new RandomInteger(1000, 10000))), ip, chain),
                new RandomBoolean(0.5), new RandomBoolean(0.1), RandomFloat.instance);

        // Print the first few log lines
        for (int i = 0; i < 25; i++)
        {
            System.out.println(all.pick());
        }
    }

}
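
The three ingredients described in the class comment (weighted choice of an IP region, a uniform random walk over an undirected site map, and a clock shared by all visitors) can be sketched without Synthia, using only java.util. This is a minimal illustration under stated assumptions, not the library's implementation: the class LogSketch, its methods firstOctet, ipAddress, link and nextPage, the tiny hand-built site map and the simplified output format are all hypothetical and do not appear in the original program.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch; none of these names belong to Synthia.
class LogSketch {

    /** First octet: "11" with probability 1/2, "10" with 1/3, otherwise 20..60 (1/6). */
    static String firstOctet(Random rnd) {
        double p = rnd.nextDouble();
        if (p < 1d / 2) {
            return "11";
        }
        if (p < 1d / 2 + 1d / 3) {
            return "10";
        }
        return Integer.toString(20 + rnd.nextInt(41)); // 20 to 60 inclusive
    }

    /** Full address: weighted first octet followed by three octets in 0..255. */
    static String ipAddress(Random rnd) {
        return firstOctet(rnd) + "." + rnd.nextInt(256) + "."
                + rnd.nextInt(256) + "." + rnd.nextInt(256);
    }

    /** An undirected edge between a and b yields a link in both directions. */
    static void link(Map<String, List<String>> map, String a, String b) {
        map.computeIfAbsent(a, k -> new ArrayList<>()).add(b);
        map.computeIfAbsent(b, k -> new ArrayList<>()).add(a);
    }

    /** One step of the random walk: every outgoing link is equally likely. */
    static String nextPage(Map<String, List<String>> map, String current, Random rnd) {
        List<String> out = map.get(current);
        return out.get(rnd.nextInt(out.size()));
    }

    public static void main(String[] args) {
        Random rnd = new Random(10);

        // Tiny hand-built site map (assumption: the real program generates it randomly)
        Map<String, List<String>> map = new HashMap<>();
        link(map, "J0.html", "Xv.html");
        link(map, "Xv.html", "9y.html");
        link(map, "9y.html", "t0.html");

        // Two visitors, each with its own address and its own position in the map
        String[] ips = { ipAddress(rnd), ipAddress(rnd) };
        String[] pages = { "J0.html", "9y.html" };

        // A single clock shared by all visitors, advanced on every page load
        long clock = 499152060000L;
        for (int i = 0; i < 5; i++) {
            int v = rnd.nextInt(ips.length);        // select one visitor
            clock += 1000 + rnd.nextInt(9000);      // random increment, 1000..9999 ms
            pages[v] = nextPage(map, pages[v], rnd); // take one transition
            System.out.println(ips[v] + " " + clock + " GET " + pages[v]);
        }
    }
}
```

Because the clock is a single variable updated in the loop, timestamps increase across all visitors, while each visitor keeps its own current page; this mirrors the distinction the comment draws between the shared Tick and the per-visitor copy of the Markov chain.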