Google Hacking Database Attributes Enrichment and Conversion to Enable the Application of Machine Learning Techniques

In the initial phase of a pentest, known as Open Source Intelligence, passive reconnaissance is performed with Google Hacking. Google Hacking is a practice that uses strings called Dorks. To support them, the Google Hacking Database is available with thousands of Dorks. However, the Google Hacking Database contains a reduced number of attributes, all with textual values, which makes it impossible to apply Machine Learning techniques. One way to enrich the Google Hacking Database with attributes is with Natural Language Processing and the transformation of textual values to numeric, converting Dork characters to ASCII. So, the objective was to apply Natural Language Processing to enrich the Google Hacking Database with attributes and convert its textual values to ASCII, to enable the application of Machine Learning techniques. The computational experiments were conducted in seven steps: Selection of the GHDB Base, Removal of Hyperlinks and Deletion of Attributes, Removal of the Site Parameter from Dorks, Removal of Outliers and Stopwords, Enrichment with Natural Language Processing, Base Transformation, and Application of the SOM. The results obtained with the application of the SOM were considered good, according to the values presented by the metrics that evaluated the network. Thus, the objective of this paper is considered to have been achieved.

1 Introduction

One method that can be used to ensure information security is to discover the vulnerabilities where the information is stored. Vulnerabilities represent security flaws that pose risks to information [1]. A practice used to find vulnerabilities in web pages is Google Hacking (GH). GH works like a Google search that uses a search string, called a Dork, a set of characters used to perform a specific search on Google [2]. Dorks are used for different purposes, such as finding vulnerabilities in the structure of a website, exposed database files, active service logs, and virus-infected files. To assist the practice of GH, the Google Hacking Database (GHDB) is available on the internet, a base with Dorks evaluated and validated by Offensive Security. Despite the number of Dorks, the GHDB contains few attributes, requiring that those who use it have prior knowledge. Furthermore, these few attributes of the base have textual values, not numeric ones, which limits the use of Machine Learning (ML) techniques, such as Artificial Neural Networks (ANNs). It is worth highlighting the importance of applying ML techniques in preventing the high number of highly complex attacks that have been taking place. One type of ANN architecture that can be used in the Information Security area is the Kohonen Self-organizing Map (SOM). According to Kohonen [3], the SOM network is an ANN capable of extracting knowledge from a database, considering all its attributes simultaneously and forming clusters by similarity [4]. This capability allows the SOM network to be applied in the IS area for various purposes, such as investigating digital evidence on computers and detecting anomalies in online environments. It is noteworthy that ML techniques have been used in Open Source Intelligence (OSINT) practices, such as GH, whose objective is to collect information from open sources [5].
So that ML techniques can be applied to the GHDB, it is necessary to enrich the base with attributes, providing more information for the techniques to conduct their learning. Furthermore, it is also necessary to transform attributes with textual values into numeric values, since artificial neural networks are mathematical models.
As for the GHDB enrichment, the Dorks can be split into characters by applying NLP tokenization.
Enrichment is the process responsible for adding information to a database, making it suitable for performing a certain task. When new information is added to a database, new facts are added to existing data, thus enabling new approaches to discovering knowledge [6]. As for the attributes with textual values in the GHDB, one way to transform them into numeric values is to perform a character conversion to ASCII [7]. So, the aim of this paper was to apply NLP to enrich the GHDB with attributes and convert its textual values to ASCII, to enable the application of ML techniques to group Dorks by similarity and find vulnerabilities. The contributions of this paper are the description of how to apply NLP to enrich the GHDB, how to transform attributes with textual values (the textual Dorks) into numeric ones, and how to apply an ANN, in this case the SOM, to group Dorks by similarity, enabling the application of an ML technique on such an important base.
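As a minimal illustration of these two ideas (splitting a Dork into characters and converting each character to its ASCII code), consider the following Python sketch; the example Dork is hypothetical:

```python
# Minimal sketch: character-level tokenization of a Dork plus conversion of
# each character to its numeric ASCII code with ord(). The Dork below is an
# illustrative example, not taken from the GHDB.
dork = "intitle:index.of"

tokens = list(dork)                      # character-level "tokens"
ascii_values = [ord(c) for c in tokens]  # numeric representation

print(tokens[:5])        # ['i', 'n', 't', 'i', 't']
print(ascii_values[:5])  # [105, 110, 116, 105, 116]
```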

Google Hacking
Because the source code of web pages is open and accessible over the internet, it is possible to determine a page's version and structure just by searching for "strings," that is, specific character sets, in search engines [8]. Search engines are information retrieval tools in which users enter keywords for queries and subsequently get results automatically. One can mention, for example, the search engines Google and Yahoo [9].
Researchers from different areas have been studying search engines through different approaches, such as the search for products and scientific publications, and in the areas of social marketing, economics, politics, and IS [10].
Roy et al. [11] present a practice for pentesting that uses the Google search engine to find vulnerabilities in internet pages using only specific strings, that is, a certain set of characters that may or may not be composed of advanced Google operators. This string used in Google to search for vulnerabilities is called a Dork, while the pentest practice that uses Google and Dorks is called Google Hacking (GH) or Google Dorking. Table 1 presents the five main items about the practice of GH. Mazurczyk and Caviglione [12], and Kalech [13] address the types of information that can be found with GH.
This information can include server names, open directories, file copies, IP address ranges, critical information about SCADA systems, online services, and devices such as cameras and printers. To understand the impact that the practice of GH can have, Rahman et al. [14] discuss the practice of GH and the vulnerabilities present in web applications. The authors discuss how easy it is to find a vulnerability or sensitive information, such as an IP address or an email address, during a search with a certain Dork available in the GHDB.

Dorks
According to Toffalini et al. [2], Dorks are "strings" that can be composed of specific words and/or parameters developed for search engines to collect information about vulnerabilities, or information that helps the search for vulnerabilities. In the literature, various Dorks are described for different purposes, for example, to find vulnerable websites, confidential information, or exposed files.
Pan et al. [15] and Quinkert, Leonhardt, and Holz [16] describe in their studies categories that can be used to classify the words and parameters that make up Dorks. Table 2 presents the categories, along with their descriptions and examples. Inurl and Intext belong to the GRAM category, as they are advanced Google operators. They direct the search to the structure of a particular site. The Inurl parameter searches for sites that contain ".gov.br" in their URL, while the Intext parameter searches for sites that contain DOC category files named "Senhas.xlsx" or "logins.doc" in their content. Thus, this Dork can be used to search for "Passwords.xlsx" or "logins.doc" files on websites that contain ".gov.br" in their URL.
According to Zhang, Notani, and Gu [17], and Mider, Garlicki, and Jan [18], the highest concentration of validated and documented Dorks in the world is available in the GHDB. It is the largest and most representative online database of Dorks in the world. A disadvantage of the base is that it has few attributes: only the text of the Dork, the author who published it in the base, and the category to which the Dork belongs. The Dorks available in the GHDB are classified into 14 categories, based on their functionality, that is, on the type of vulnerability they seek. The categories are shown in Table 3.

Natural Language Processing
With the significant growth of user-generated content on the Internet, the automatic extraction of relevant information started to receive interest from researchers from different areas. Many of these researchers are achieving this online information extraction through Natural Language Processing [19]. Natural Language Processing (NLP) is the subarea of AI responsible for making computers able to interpret and produce content in human language. As it is an interdisciplinary area, it includes other areas such as Computer Science, Linguistics, Psychology, and Statistics [20].
The application of NLP to texts or other human language content can be performed through several tasks. Among the main tasks, the following stand out: stemming, corpus production, tokenization, lemmatization, grammatical marking, syntactic analysis, and the removal of stopwords [21][22]. The main tasks of NLP are described in Table 4.

Task Description
Stemming Used to consolidate different variations of a word that share the same stem into a common root form. For example, the words "Like" and "Likes" will both be simplified to the root form "Lik".

Corpus
It is the formation of a set of all the words present in a text in a single item.Also called "Text Base," it is used in most NLP tasks.

Tokenization
Tokenization, also called "Word Segmentation," is responsible for breaking up a certain sequence of characters in a text; that is, it determines where the words of a text start and end and transforms them into tokens. Tokens are lists generated from a tokenized corpus.

Lemmatizing
The reduction of surface word forms to their canonical form, which is called a lemma. The lemma relates different forms of words with the same meaning. For example, the word "Best" has the word "Good" as its lemma. Its use is efficient for information retrieval.

Grammatical Marking (POS)
This is a basic task in linguistics applied to the corpus. The goal is to assign morphosyntactic characteristics to each word in a sentence according to its context. It can also be applied to sentences and paragraphs.

Syntax analysis
The natural successor to grammatical marking, syntactic analysis (parsing) provides a dependency tree as the output for each word within a corpus. Its objective is to provide, for each sentence or clause, an abstract representation of the grammatical entities and their relationships.

Removal of Stopwords
The removal of stopwords is intended to keep a more concise and cleaner corpus for future analysis. An example of its application is the removal of words such as "of," "if," "are," and "is."
Frequency Used to produce a list of words and their frequency in each corpus. In addition, it is possible to produce word-frequency lists using a corpus annotated with grammatical marking.
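To make three of the tasks in Table 4 concrete, the sketch below uses deliberately simplified stand-ins: a crude suffix-stripping "stemmer", whitespace tokenization, and removal against a tiny hand-made stopword list. Real projects would use a library such as NLTK; the function and word lists here are illustrative assumptions.

```python
def toy_stem(word):
    # Crude stemmer: strip a few common English suffixes (illustrative only;
    # a real stemmer such as Porter's handles many more cases).
    for suffix in ("s", "es", "ing"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[:-len(suffix)]
    return word

text = "likes files are parsing the logs"
tokens = text.split()                      # whitespace tokenization
stems = [toy_stem(t) for t in tokens]      # stemming
stopwords = {"are", "the"}                 # toy stopword list (assumption)
filtered = [t for t in tokens if t not in stopwords]

print(stems)     # ['like', 'file', 'are', 'pars', 'the', 'log']
print(filtered)  # ['likes', 'files', 'parsing', 'logs']
```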
Several libraries and tools are used to implement and develop NLP algorithms, in addition to the main AI techniques used for NLP [19].
Table 5 describes the main libraries and tools used for NLP. As for the application of NLP in the IS area, studies show a trend of applying it in pentesting, in the initial step called Reconnaissance or OSINT. The justification is that the application of NLP increases the effectiveness of discovering already published and documented vulnerabilities, such as outdated software versions and online device configuration files [23].

Artificial Neural Networks
Within Artificial Intelligence (AI) there are subareas such as Natural Language Processing (NLP), Computer Vision (CV), and Machine Learning (ML). The ML field is concerned with the issue of how to build computer programs that automatically improve with experience [24][25]. One ML technique that can be used to solve problems in the IS area is Artificial Neural Networks (ANNs). ANNs can be used for several tasks, such as classification, clustering, association, pattern recognition, regression, and prediction [26].
ANNs are mathematical models of artificial intelligence inspired by the structure of the brain to simulate human behavior in processes such as learning, association, generalization, and abstraction. An ANN can learn and improve its performance based on the environment in which it finds itself. ANNs are very effective at solving nonlinear problems and performing parallel processing. In addition, they can simulate complex systems, an ability that traditional computational techniques lack [27][28].
An important feature of ANNs is the ability to learn from data that is incomplete and subject to noise. Fault tolerance is part of the architecture due to the distributed nature of the processing: if a neuron fails, its incorrect output will be compensated by the other correct outputs [28]. In ANNs, learning occurs through a set of simple processing units called artificial neurons. The representation of the basic elements of an artificial neuron is shown in Fig. 1. It shows the input data (input vector) of the neuron (x1, x2, ..., xn), the input connections with their respective weights (w1j, ..., wnj), the additive junction or sum represented by the letter sigma, then the activation function (φ), and finally the output (y).
The activation function of the artificial neuron performs a role similar to the synapse in the biological neuron, transmitting or blocking nerve impulses. In this way, the learning of ANNs happens through weight adjustments. The weight value is determined based on its value in the previous iteration, as shown in Eq. (1):

w_ij(t+1) = w_ij(t) + Δw_ij(t)    (1)

Updating the weights depends on the algorithm, but it is based on minimizing the error between the values predicted by the network and the desired outputs, as shown in Eq. (2):

Δw_ij(t) = η (d_j − y_j) x_i    (2)

As for the application of ANNs in information security, it is possible to obtain interesting results, such as in the classification of malicious and phishing sites, and in the classification of traffic that exploits denial-of-service vulnerabilities in information systems [29][30].
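A numeric illustration of this error-driven weight adjustment, using made-up values for the learning rate, desired output, actual output, input, and previous weight (a delta-rule style update; the exact rule depends on the training algorithm):

```python
# Hypothetical values: learning rate eta, desired output d, actual output y,
# input x. All chosen for illustration only.
eta, d, y, x = 0.1, 1.0, 0.6, 0.5

delta_w = eta * (d - y) * x   # weight change driven by the output error
w_old = 0.2                   # weight value from the previous iteration
w_new = w_old + delta_w       # updated weight

print(round(delta_w, 3))  # 0.02
print(round(w_new, 3))    # 0.22
```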

Self-organizing Map
The Self-organizing Map (SOM) proposed by Kohonen [31] is a network built around a one- or two-dimensional grid of neurons to capture the important characteristics contained in an input space (data) of interest. The SOM network is an ANN based on unsupervised learning, capable of processing input from a multidimensional space and transforming it into a one-dimensional or two-dimensional array. The SOM algorithm is inspired by neurobiology and incorporates all the basic mechanisms of self-organization: competition, cooperation, and self-amplification [28][31].
The structure of the SOM network is composed of neurons interconnected by a relationship called a neighborhood. It is this relationship that determines the topology of the map. For each data item provided to the SOM network, there is a competition among all neurons for the right to represent it. The neuron that wins the competition is the one whose weight vector has the values closest to the input vector. This type of learning is called competitive learning [4]. In Fig. 2, an example of the training phase of the SOM network is presented, simulating 16 neurons simultaneously receiving the input vector X.
When each of the input vectors X is processed by the SOM, each output neuron receives a value and calculates its activation level, according to Eq. (3):

u_i = x · w_i = Σ_j x_j w_ij    (3)

where x is the input vector, i is the index that indicates which neuron is receiving the input, and w_i is the weight vector between the input and neuron i. The Best Match Unit (BMU) is the neuron with the highest u_i, that is, the one closest to the input vector. This is the neuron that will represent the pattern of the input vector data. The other neurons compete to determine which one will receive a value closer to the BMU in order to also remain active.
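The BMU competition can be sketched as follows, using a dot-product activation on four made-up neurons (in many SOM implementations the BMU is equivalently chosen as the neuron with the smallest Euclidean distance to the input):

```python
import numpy as np

x = np.array([0.9, 0.1])      # input vector (illustrative values)
W = np.array([[0.8, 0.2],     # weight vector of neuron 0
              [0.1, 0.9],     # neuron 1
              [0.5, 0.5],     # neuron 2
              [1.0, 0.0]])    # neuron 3

u = W @ x                     # activation level u_i of each neuron
bmu = int(np.argmax(u))       # BMU: the neuron with the highest u_i

print(bmu)  # 3
```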
The SOM network algorithm can be synthesized in five steps [27][28], which are described in Table 6.

Table 6 Five SOM steps synthesized by Haykin
Step Description

Beginning
Choice of random values for the weight vectors.

Choice of Input Standard
Choosing an input pattern x and determining the neighborhood of the neurons.

BMU De nition
Choosing the BMU neuron based on the similarity between the neuron's activation level and the input value.
Weight Update

Modification of the values of the weight vectors of the neurons in the network.

Continuation
Repeat steps 2, 3, and 4 until no signi cant changes in the map are observed.
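The five steps can be sketched as a minimal NumPy training loop on a tiny one-dimensional map; the map size, learning rate, decay schedule, and Gaussian neighborhood below are illustrative choices, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random((50, 3))           # 50 samples with 3 features (toy data)
n_neurons = 8
W = rng.random((n_neurons, 3))       # step 1: random initial weight vectors
positions = np.arange(n_neurons)     # grid coordinates on a 1-D map

lr, sigma = 0.5, 2.0
for epoch in range(100):             # step 5: repeat until the map stabilizes
    for x in data:                   # step 2: present an input pattern
        dists = np.linalg.norm(W - x, axis=1)
        bmu = int(np.argmin(dists))  # step 3: BMU = closest weight vector
        # step 4: move the BMU and its neighbors toward the input,
        # weighted by a Gaussian neighborhood centered on the BMU.
        h = np.exp(-((positions - bmu) ** 2) / (2 * sigma ** 2))
        W += lr * h[:, None] * (x - W)
    lr *= 0.99                       # decay learning rate and neighborhood
    sigma *= 0.99

# After training, every weight vector stays inside the data's [0, 1] range.
print(bool(W.min() >= 0 and W.max() <= 1))  # True
```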
To assess the quality of the map and analyze whether the chosen topology is the one that best represents the input vector data X, some quality measures can be used, such as the Quantization Error (QE) and the Topographic Error (TE) [31]. Table 7 presents the description of each of these measures.

Quantization Error (QE)

Shows the quality of the representation of the input vector data. The better the representation of the input vector, the better the arrangement of neurons on the map. The quantization error will be close to zero when all nodes are well distributed on the map.

Topographic Error (TE)
Measures the topology preservation of the input data. As data moves from a multidimensional space to a two-dimensional or one-dimensional space, it ends up losing information. One way to evaluate the representation of the initial input vector is by using the topographic error. When the topographic error is close to zero, it means that all nodes represent the initial input vector well.
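Both measures can be computed by hand; the sketch below does so for a toy one-dimensional map, where two neurons count as neighbors when their grid indices differ by one (the map, data, and neighborhood rule are illustrative assumptions):

```python
import numpy as np

W = np.array([[0.0], [0.5], [1.0], [1.5]])   # 4 neurons on a 1-D grid
data = np.array([[0.1], [0.6], [1.4]])       # toy input vectors

def quantization_error(data, W):
    # Mean distance between each input and its BMU (closest weight vector).
    d = np.linalg.norm(data[:, None, :] - W[None, :, :], axis=2)
    return d.min(axis=1).mean()

def topographic_error(data, W):
    # Fraction of inputs whose two closest neurons are NOT grid neighbors.
    d = np.linalg.norm(data[:, None, :] - W[None, :, :], axis=2)
    two_best = np.argsort(d, axis=1)[:, :2]
    return float(np.mean(np.abs(two_best[:, 0] - two_best[:, 1]) != 1))

print(round(quantization_error(data, W), 3))  # 0.1
print(topographic_error(data, W))             # 0.0
```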

Methodology
The literature review was performed using the following keywords: "Natural Language Processing", "Google Hacking", "GHDB", "Dorks", and "Artificial Neural Networks" in the databases ACM Digital Library, Emerald Insight, IEEE Digital Library, and ScienceDirect. The Dorks base selected was the GHDB (https://www.exploit-db.com/google-hacking-database) because it has the largest number of documented and tested Dorks among all those available on the internet [17][18]. The GHDB has a total of 4,211 Dorks and 4 attributes, which are: Date, which contains the date the Dork was published in the base; Dork, which contains the Dork and its access link; Category, which informs which category the Dork belongs to; and Author, which informs who sent the Dork to the base. Table 8 presents a sample of the GHDB base. The steps for performing the computational experiments shown in Fig. 3 were based on three approaches to performing Open Source Intelligence (OSINT), as shown in Table 9.

The three approaches, with their publication years, are:

OSINT Approach to Support Cybersecurity Operations [32], 2018
OSINT Approach to Inspecting Critical Infrastructure Systems [33], 2016
OSINT Approach to Obtain Intelligence Information from Cyber Threats [34], 2018

The authors reinforce in their work that when running OSINT through an approach, together with ML techniques, it becomes possible to extract new knowledge from the discovered information.

g) Step G - SOM Application: In this step, the SOM was applied to validate the GHDB enrichment and conversion and to generate clusters of similar Dorks. Its performance was evaluated by the Quantization Error (QE) and Topographic Error (TE) values. Good results in both errors indicate whether the enriched and converted GHDB enabled the application of ML techniques.

Presentation And Discussion Of Results
The results of the computational experiments obtained with the application of the seven steps, shown in Figure 3, are presented below.

a) Step A - Selection of the Google Hacking Database: In this step, the Google Hacking Database (GHDB) from Offensive Security was selected because it is an online base. It was necessary to collect the Dorks from the site and export them to a .csv file. The base has a total of 14 categories of Dorks. For this experiment, the Dorks of the categories "Advisories and Vulnerabilities" and "Files Containing Juicy Info" were selected as a sample. These categories were chosen because they have the largest numbers of Dorks, with 1,996 and 450 Dorks respectively.

b) Step B - Removing Attributes and Hyperlinks: In Excel, the hyperlinks from the Dorks and the Author and Date attributes were removed from the base, as these attributes do not influence the Dorks. In this way, the base was left with 2 remaining attributes: Dork and Category.

c) Step C - Removing the Site Parameter in Dorks: Specific Dorks became Dorks capable of running on any site. For this purpose, the Site parameter was removed from the Dorks that had it. To remove the Site parameter, Excel was used to search for the parameter "Site:".
After finding the Dorks that contained the "Site:" parameter, these Dorks were modified by removing it. Among the Dorks that had the "Site:" parameter, specific Dorks were found for proxy sites, Google Drive, Github, Mediafire, Dropbox, Sourceforge, and eBay.

d) Step D - Removing Outliers and Stopwords: At this step, analyzing the Dorks, it was noticed that a few had more than 100 characters in their composition. These Dorks had more than 100 characters for two main reasons: Composite Dorks and URLs. Thus, they were considered Outliers in this experiment.

Composite Dorks are Dorks that have more than one Dork in their string. URLs are links to specific vulnerabilities on certain websites. Dorks that dealt with URLs were removed, as there would be no way to make them generic and thus automatically run them on other web pages. Composite Dorks were divided into smaller Dorks and then added to the base in their respective categories.
Then, the removal of stopwords was performed to reduce noise in the base. This was necessary because the GHDB contains some Dorks with special characters that, when converted to their numerical values, have values very different from those of the alphanumeric characters. To perform the removal, we defined 40 special characters as stopwords to be removed. The removed stopwords were as follows: ,':;"'!?"()`@~/|*[]^_.+\#%¨¬&©ºª}{£¢§

e) Step E - Enrichment with Natural Language Processing: In this step, the base Dorks were selected and split into characters by applying NLP tokenization, making each character an attribute in the base. This was necessary because the base had only two attributes so far: Dork and Category. The low number of attributes makes it impossible to apply ML techniques to the base.
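The special-character removal described in step D can be sketched as below; the character set is an illustrative subset of the 40 stopwords listed above (with '=' added so the example matches the worked Dork of step F):

```python
# Subset of the special characters treated as stopwords (assumption: a few
# of the harder-to-encode characters are omitted here for readability).
SPECIAL = set(",':;\"!?()`@~/|*[]^_.+\\#%&=")

def clean_dork(dork):
    # Keep only the characters that are not in the stopword set.
    return "".join(c for c in dork if c not in SPECIAL)

print(clean_dork("inurl:/phpmyadmin/index.php?db="))  # inurlphpmyadminindexphpdb
```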
To enrich the base, that is, to add new attributes, an algorithm was developed in Python to discover the Dork with the greatest number of characters in its composition and create the same number of attributes in the Dorks base. Thus, each Dork can be divided into characters that populate the new attributes. This action not only enriches the base but also avoids a problem in the next step of the experiment (step F), when each Dork is converted to its numeric ASCII value: without the split, the numeric values obtained from the conversion would be too long, making the application of ML techniques impossible.
For example, a 10-character Dork, when converted to its numeric value, becomes a 30-digit numeric value, because each character converted to ASCII has a 3-digit numeric value. On the other hand, if each base attribute holds only a single character, each attribute receives a numeric value of 3 digits, enabling the application of intelligent techniques to the base. Thus, 94 attributes were created in the base, named Carac01, Carac02, Carac03, up to Carac94. The database now has a total of 95 attributes: the 94 "Carac" attributes plus the Category attribute, whose numerical values were defined in step C. The Dork division was performed with the "nltk.word_tokenize()" function.
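The enrichment can be sketched as follows: find the longest Dork, create one "Carac" attribute per character position, and pad shorter Dorks with 0 (the three Dorks below are toy examples; the paper's base yields 94 such attributes):

```python
dorks = ["inurladmin", "intextpassword", "filetypelog"]  # toy Dorks

max_len = max(len(d) for d in dorks)       # the longest Dork sets the count
columns = [f"Carac{i + 1:02d}" for i in range(max_len)]

rows = []
for d in dorks:
    # One attribute per character; positions past the Dork's end receive 0.
    chars = list(d) + [0] * (max_len - len(d))
    rows.append(dict(zip(columns, chars)))

print(len(columns))        # 14
print(rows[0]["Carac01"])  # i
print(rows[0]["Carac11"])  # 0
```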

f) Step F - Base Transformation: After applying NLP in step E, the Dork characters were converted to their numerical values. For this, the Dork characters were selected and converted to their respective ASCII values.

This conversion follows the study by Guo et al. (2018), which converts characters to their numeric ASCII values to detect memory overflow vulnerabilities. To conduct the conversion, the "ord()" function of the Python language was used, the same function used in Guo's study. For example, consider the Dork inurl:/phpmyadmin/index.php?db=. In step D, this Dork was processed along with the other Dorks in the base, and its special characters were removed. The Dork thus became: inurlphpmyadminindexphpdb.
Then, in step E, the Dorks were divided by characters. This Dork now has 25 characters, and each character was assigned to an attribute. In this way, the first character of this Dork, "i", was assigned to the attribute Carac01; the second character, "n", was assigned to the attribute Carac02; and so on until the end of the Dork. The remaining attributes received a value of 0 so as not to leave the base with null values.
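The conversion performed in step F on this example Dork can be sketched as:

```python
# The example Dork after step D removed its special characters.
dork = "inurlphpmyadminindexphpdb"

# Step F: each character attribute receives the character's ASCII code;
# padding positions keep the value 0.
values = [ord(c) for c in dork]

print(len(values))  # 25
print(values[:5])   # [105, 110, 117, 114, 108]
```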
In step F, this Dork had its characters converted to their numeric ASCII values.

Step G - SOM Application: After enriching and transforming the Dorks base, the SOM was applied to validate the enrichment and conversion performed on the base, the possibility of applying ML techniques, and the finding of vulnerabilities in the generated clusters. For this, we sought to extract knowledge from the Dorks base with the application of the SOM. To run the SOM, we defined the map dimension as 225 neurons, that is, a 15x15 map with a hexagonal topological neighborhood. In addition, the parameters used in the training phase were a number of epochs (iterations) equal to 3000 and a learning rate equal to 0.5 [3].
For this experiment, all Dorks from the categories "Advisories and Vulnerabilities" and "Files Containing Juicy Info" were selected as a sample. These two categories have the highest numbers of Dorks. The "Advisories and Vulnerabilities" category has a total of 1,996 Dorks, which search web pages with unprotected files. The application of the SOM generated a map with 3 groups, shown in Figure 4.
The application of the SOM network generated three groups in the Advisories and Vulnerabilities base.
Table 10 shows the characteristics of each one of them. The application of the SOM generated four groups in the "Files Containing Juicy Info" base.
Table 11 shows the characteristics of each one of them. As shown in Table 12, the errors had values close to 0. This means that the topology of the input data was preserved, that is, all nodes represented the initial input vector well. Thus, the errors obtained in the application of the SOM can be considered good. It is therefore understood that the enrichment of the GHDB with NLP, together with the conversion of Dork characters to numeric ASCII values, made it possible to apply an ML technique to generate clusters of similar Dorks and to identify vulnerabilities.

Conclusion
This paper applied NLP to enrich the attributes of the GHDB and convert its textual values into numeric values using ASCII codes, in order to apply ML techniques. The computational experiments developed included seven steps, which culminated in the validation by the SOM of the GHDB enrichment and conversion, in addition to the generation of clusters with similar Dorks and the identification of vulnerabilities.
The results obtained with the application of the SOM were considered good, according to the values presented by the metrics that evaluated the network. Thus, the objective of this paper is considered to have been achieved.
With the base enriched and converted, it becomes possible to use other ML techniques to automate information security tests, such as in the construction of OSINT approaches, or even for the creation of rules for defense systems such as firewalls, IDSs, and IPSs, making them capable of detecting GH practices.
Among the limitations observed in this paper, the definition of stopwords stands out, because no pre-defined set of special characters was found, along with the lack of studies in the literature against which to compare the results, since the phases of the computational experiments were inspired by three different approaches.
The study conducted here does not intend to exhaust the subject; on the contrary, it sought to contribute to the Information Security area regarding the application of ML techniques to the identification of vulnerabilities by enriching and converting the GHDB. It is expected that the phases presented and applied in the computational experiments can stimulate further research. This scenario, therefore, offers ample room for continuation work.

Figure 2: Competitive learning in a 16-neuron SOM network.

Figure 3: Steps of the Computational Experiments. The computational experiments were conducted in seven steps: Selection of the GHDB Base, Removal of Hyperlinks and Deletion of Attributes, Removal of the Site Parameter from Dorks, Removal of Outliers and Stopwords, Enrichment with Natural Language Processing, Base Transformation, and Application of the SOM. a) Step A - Selection of the Google Hacking Database: the GHDB base was selected to conduct the computational experiments. b) Step B - Removing Attributes and Hyperlinks: the hyperlinks embedded in the Dorks were removed, along with the nominal attributes of the GHDB that were disregarded. c) Step C - Removing the Site Parameter in Dorks: specific Dorks became Dorks capable of running on any site; for this, the "Site" parameter present in the Dorks was removed. d) Step D - Removing Outliers and Stopwords: the removed stopwords were special characters present in the Dorks, and the removed outliers were Composite Dorks and URLs. e) Step E - Enrichment with Natural Language Processing: the base Dorks were split into characters by NLP tokenization, and each Dork character was transformed into an attribute. f) Step F - Base Transformation: the base Dorks were converted to their respective numerical ASCII values.

Figures

Figure 1: Basic elements of an artificial neuron.

Table 1
Five main items for the practice of Google Hacking

Uses Google's caching system to go directly to a snapshot of a web page. That is, it is possible to extract information from that page without entering the domain, thus managing to consult pages without establishing any direct connection with the destination.

The GH practice not only discovers vulnerabilities in the structure of a web page, but it can also discover files that are open to the public, such as password logs (explicit, hash, encrypted, etc.), logins, and databases, among others.

Google Hacking Database: the Google Hacking Database (GHDB) is a database with thousands of Dorks evaluated and validated by Offensive Security.

Table 3

E-commerce pages that display unprotected information
Network of Vulnerability Data: pages that display vulnerable data about the structure of a network
Pages Containing Login Portals: pages containing vulnerable login portals
Various Online Devices: pages containing unprotected online devices
Advisories and Vulnerabilities: pages that contain vulnerabilities coming from advisories

The use of the Dorks contained in the GHDB in the GH practice allows finding vulnerabilities in web pages already in the initial step of a pentest, called Reconnaissance or OSINT [18].

Table 5
Main libraries and tools for NLP

Table 7
Accuracy measures for SOM

Table 10 Map
Characteristics in the Advisories and Vulnerabilities Category

The application of the SOM to the "Files Containing Juicy Info" base generated a map with 4 clusters. This map is shown in Figure 5.

Table 11 Map
Characteristics in the Files Containing Juicy Info Category

It is observed in Table 11 that the Dorks address vulnerabilities that allow exploiting unprotected files with information about other systems on web pages. These vulnerabilities span various file extensions and technologies, such as SQL and Netscape. Then, the quality of the map generated by the SOM was evaluated through the Quantization Error (QE) and the Topographic Error (TE). The values are shown in Table 12.

Table 12
Results of the metrics of the maps generated by the SOM network