
Commit 6efb213

Merge pull request #21523 from owen-mc/docs/mad/barriers
Document models-as-data barriers and barrier guards and add change notes
2 parents c91b5b3 + 73cc54c commit 6efb213

18 files changed

+961
-468
lines changed
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
1+
---
2+
category: feature
3+
---
4+
* Data flow barriers and barrier guards can now be added using data extensions. For more information, see [Customizing library models for C and C++](https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-cpp/).
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
1+
---
2+
category: feature
3+
---
4+
* Data flow barriers and barrier guards can now be added using data extensions. For more information, see [Customizing library models for C#](https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-csharp/).

docs/codeql/codeql-language-guides/customizing-library-models-for-actions.rst

Lines changed: 19 additions & 19 deletions
@@ -24,26 +24,26 @@ The CodeQL library for GitHub Actions exposes the following extensible predicate
2424

2525
Customizing data flow and taint tracking:
2626

27-
- **actionsSourceModel**\(action, version, output, kind, provenance)
28-
- **actionsSinkModel**\(action, version, input, kind, provenance)
29-
- **actionsSummaryModel**\(action, version, input, output, kind, provenance)
27+
- ``actionsSourceModel(action, version, output, kind, provenance)``
28+
- ``actionsSinkModel(action, version, input, kind, provenance)``
29+
- ``actionsSummaryModel(action, version, input, output, kind, provenance)``
3030

3131
Customizing Actions-specific analysis:
3232

33-
- **argumentInjectionSinksDataModel**\(regexp, command_group, argument_group)
34-
- **contextTriggerDataModel**\(trigger, context_prefix)
35-
- **externallyTriggerableEventsDataModel**\(event)
36-
- **immutableActionsDataModel**\(action)
37-
- **poisonableActionsDataModel**\(action)
38-
- **poisonableCommandsDataModel**\(regexp)
39-
- **poisonableLocalScriptsDataModel**\(regexp, group)
40-
- **repositoryDataModel**\(visibility, default_branch_name)
41-
- **trustedActionsOwnerDataModel**\(owner)
42-
- **untrustedEventPropertiesDataModel**\(property, kind)
43-
- **untrustedGhCommandDataModel**\(cmd_regex, flag)
44-
- **untrustedGitCommandDataModel**\(cmd_regex, flag)
45-
- **vulnerableActionsDataModel**\(action, vulnerable_version, vulnerable_sha, fixed_version)
46-
- **workflowDataModel**\(path, trigger, job, secrets_source, permissions, runner)
33+
- ``argumentInjectionSinksDataModel(regexp, command_group, argument_group)``
34+
- ``contextTriggerDataModel(trigger, context_prefix)``
35+
- ``externallyTriggerableEventsDataModel(event)``
36+
- ``immutableActionsDataModel(action)``
37+
- ``poisonableActionsDataModel(action)``
38+
- ``poisonableCommandsDataModel(regexp)``
39+
- ``poisonableLocalScriptsDataModel(regexp, group)``
40+
- ``repositoryDataModel(visibility, default_branch_name)``
41+
- ``trustedActionsOwnerDataModel(owner)``
42+
- ``untrustedEventPropertiesDataModel(property, kind)``
43+
- ``untrustedGhCommandDataModel(cmd_regex, flag)``
44+
- ``untrustedGitCommandDataModel(cmd_regex, flag)``
45+
- ``vulnerableActionsDataModel(action, vulnerable_version, vulnerable_sha, fixed_version)``
46+
- ``workflowDataModel(path, trigger, job, secrets_source, permissions, runner)``
4747

4848
Examples of custom model definitions
4949
------------------------------------
@@ -62,9 +62,9 @@ To allow any Action from the publisher ``octodemo``, such as ``octodemo/3rd-part
6262
.. code-block:: yaml
6363
6464
extensions:
65-
- addsTo:
65+
- addsTo:
6666
pack: codeql/actions-all
67-
extensible: trustedActionsOwnerDataModel
67+
extensible: trustedActionsOwnerDataModel
6868
data:
6969
- ["octodemo"]
7070

docs/codeql/codeql-language-guides/customizing-library-models-for-cpp.rst

Lines changed: 91 additions & 13 deletions
@@ -58,6 +58,8 @@ The CodeQL library for CPP analysis exposes the following extensible predicates:
5858
- ``sourceModel(namespace, type, subtypes, name, signature, ext, output, kind, provenance)``. This is used to model sources of potentially tainted data. The ``kind`` of the sources defined using this predicate determines which threat model they are associated with. Different threat models can be used to customize the sources used in an analysis. For more information, see ":ref:`Threat models <threat-models-cpp>`."
5959
- ``sinkModel(namespace, type, subtypes, name, signature, ext, input, kind, provenance)``. This is used to model sinks where tainted data may be used in a way that makes the code vulnerable.
6060
- ``summaryModel(namespace, type, subtypes, name, signature, ext, input, output, kind, provenance)``. This is used to model flow through elements.
61+
- ``barrierModel(namespace, type, subtypes, name, signature, ext, output, kind, provenance)``. This is used to model barriers, which are elements that stop the flow of taint.
62+
- ``barrierGuardModel(namespace, type, subtypes, name, signature, ext, input, acceptingValue, kind, provenance)``. This is used to model barrier guards, which are elements that can stop the flow of taint depending on a conditional check.
6163

6264
The extensible predicates are populated using the models defined in data extension files.
6365

@@ -75,7 +77,7 @@ This example shows how the CPP query pack models the return value from the ``rea
7577
7678
boost::asio::read_until(socket, recv_buffer, '\0', error);
7779
78-
We need to add a tuple to the ``sourceModel``\(namespace, type, subtypes, name, signature, ext, output, kind, provenance) extensible predicate by updating a data extension file.
80+
We need to add a tuple to the ``sourceModel(namespace, type, subtypes, name, signature, ext, output, kind, provenance)`` extensible predicate by updating a data extension file.
7981

8082
.. code-block:: yaml
8183
@@ -86,12 +88,11 @@ We need to add a tuple to the ``sourceModel``\(namespace, type, subtypes, name,
8688
data:
8789
- ["boost::asio", "", False, "read_until", "", "", "Argument[*1]", "remote", "manual"]
8890
89-
Since we are adding a new source, we need to add a tuple to the ``sourceModel`` extensible predicate.
9091
The first five values identify the callable (in this case a free function) to be modeled as a source.
9192

9293
- The first value ``"boost::asio"`` is the namespace name.
93-
- The second value ``""`` is the name of the type (class) that contains the method. Because we're modelling a free function, the type is left blank.
94-
- The third value ``False`` is a flag that indicates whether or not the sink also applies to all overrides of the method. For a free function, this should be ``False``.
94+
- The second value ``""`` is the name of the type (class) that contains the method. Because we're modeling a free function, the type is left blank.
95+
- The third value ``False`` is a flag that indicates whether or not the model also applies to all overrides of the method. For a free function, this should be ``False``.
9596
- The fourth value ``"read_until"`` is the function name.
9697
- The fifth value is the function input type signature, which can be used to narrow down between functions that have the same name. In this case, we want the model to include all functions in ``boost::asio`` called ``read_until``.
9798

@@ -111,7 +112,7 @@ This example shows how the CPP query pack models the second argument of the ``bo
111112
112113
boost::asio::write(socket, send_buffer, error);
113114
114-
We need to add a tuple to the ``sinkModel``\(namespace, type, subtypes, name, signature, ext, input, kind, provenance) extensible predicate by updating a data extension file.
115+
We need to add a tuple to the ``sinkModel(namespace, type, subtypes, name, signature, ext, input, kind, provenance)`` extensible predicate by updating a data extension file.
115116

116117
.. code-block:: yaml
117118
@@ -122,12 +123,11 @@ We need to add a tuple to the ``sinkModel``\(namespace, type, subtypes, name, si
122123
data:
123124
- ["boost::asio", "", False, "write", "", "", "Argument[*1]", "remote-sink", "manual"]
124125
125-
Since we want to add a new sink, we need to add a tuple to the ``sinkModel`` extensible predicate.
126126
The first five values identify the callable (in this case a free function) to be modeled as a sink.
127127

128128
- The first value ``"boost::asio"`` is the namespace name.
129-
- The second value ``""`` is the name of the type (class) that contains the method. Because we're modelling a free function, the type is left blank.
130-
- The third value ``False`` is a flag that indicates whether or not the sink also applies to all overrides of the method. For a free function, this should be ``False``.
129+
- The second value ``""`` is the name of the type (class) that contains the method. Because we're modeling a free function, the type is left blank.
130+
- The third value ``False`` is a flag that indicates whether or not the model also applies to all overrides of the method. For a free function, this should be ``False``.
131131
- The fourth value ``"write"`` is the function name.
132132
- The fifth value is the function input type signature, which can be used to narrow down between functions that have the same name. In this case, we want the model to include all functions in ``boost::asio`` called ``write``.
133133

@@ -147,7 +147,7 @@ This example shows how the CPP query pack models flow through a function for a s
147147
148148
boost::asio::write(socket, boost::asio::buffer(send_str), error);
149149
150-
We need to add tuples to the ``summaryModel``\(namespace, type, subtypes, name, signature, ext, input, output, kind, provenance) extensible predicate by updating a data extension file:
150+
We need to add tuples to the ``summaryModel(namespace, type, subtypes, name, signature, ext, input, output, kind, provenance)`` extensible predicate by updating a data extension file:
151151

152152
.. code-block:: yaml
153153
@@ -158,13 +158,11 @@ We need to add tuples to the ``summaryModel``\(namespace, type, subtypes, name,
158158
data:
159159
- ["boost::asio", "", False, "buffer", "", "", "Argument[*0]", "ReturnValue", "taint", "manual"]
160160
161-
Since we are adding flow through a function, we need to add tuples to the ``summaryModel`` extensible predicate.
162-
163161
The first five values identify the callable (in this case a free function) to be modeled as a summary.
164162

165163
- The first value ``"boost::asio"`` is the namespace name.
166-
- The second value ``""`` is the name of the type (class) that contains the method. Because we're modelling a free function, the type is left blank.
167-
- The third value ``False`` is a flag that indicates whether or not the sink also applies to all overrides of the method. For a free function, this should be ``False``.
164+
- The second value ``""`` is the name of the type (class) that contains the method. Because we're modeling a free function, the type is left blank.
165+
- The third value ``False`` is a flag that indicates whether or not the model also applies to all overrides of the method. For a free function, this should be ``False``.
168166
- The fourth value ``"buffer"`` is the function name.
169167
- The fifth value is the function input type signature, which can be used to narrow down between functions that have the same name. In this case, we want the model to include all functions in ``boost::asio`` called ``buffer``.
170168

@@ -176,6 +174,86 @@ The remaining values are used to define the input and output specifications, the
176174
- The ninth value ``"taint"`` is the kind of the flow. ``taint`` means that taint is propagated through the call.
177175
- The tenth value ``"manual"`` is the provenance of the summary, which is used to identify the origin of the summary model.
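As a complement to the ``taint`` summary above, the following is a minimal sketch of a summary that uses the ``value`` kind, which indicates that the value itself (not just taint) is preserved through the call. The namespace ``myapp`` and the function ``identity`` are hypothetical names used only for illustration; they are not part of any real library or of the CPP query pack.

```yaml
extensions:
  - addsTo:
      pack: codeql/cpp-all
      extensible: summaryModel
    data:
      # Hypothetical free function myapp::identity(x) that returns its first
      # argument unchanged, so the flow kind is "value" rather than "taint".
      - ["myapp", "", False, "identity", "", "", "Argument[0]", "ReturnValue", "value", "manual"]
```

The tuple has the same ten columns as the ``buffer`` example; only the callable, the input specification, and the kind differ.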
178176

177+
Example: Taint barrier using the ``mysql_real_escape_string`` function
178+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
179+
180+
This example shows how the CPP query pack models the ``mysql_real_escape_string`` function as a barrier for SQL injection.
181+
This function escapes special characters in a string for use in an SQL statement, which prevents SQL injection attacks.
182+
183+
.. code-block:: cpp
184+
185+
char *query = "SELECT * FROM users WHERE name = '%s'";
186+
char *name = get_untrusted_input();
187+
char *escaped_name = new char[2 * strlen(name) + 1];
188+
mysql_real_escape_string(mysql, escaped_name, name, strlen(name)); // escaped_name is now safe to use in an SQL statement.
189+
sprintf(query_buffer, query, escaped_name);
190+
191+
We need to add a tuple to the ``barrierModel(namespace, type, subtypes, name, signature, ext, output, kind, provenance)`` extensible predicate by updating a data extension file.
192+
193+
.. code-block:: yaml
194+
195+
extensions:
196+
- addsTo:
197+
pack: codeql/cpp-all
198+
extensible: barrierModel
199+
data:
200+
- ["", "", False, "mysql_real_escape_string", "", "", "Argument[*1]", "sql-injection", "manual"]
201+
202+
The first five values identify the callable (in this case a free function) to be modeled as a barrier.
203+
204+
- The first value ``""`` is the namespace name.
205+
- The second value ``""`` is the name of the type (class) that contains the method. Because we're modeling a free function, the type is left blank.
206+
- The third value ``False`` is a flag that indicates whether or not the model also applies to all overrides of the method. For a free function, this should be ``False``.
207+
- The fourth value ``"mysql_real_escape_string"`` is the function name.
208+
- The fifth value is the function input type signature, which can be used to narrow down between functions that have the same name.
209+
210+
The sixth value should be left empty and is out of scope for this documentation.
211+
The remaining values are used to define the output specification, the ``kind``, and the ``provenance`` (origin) of the barrier.
212+
213+
- The seventh value ``"Argument[*1]"`` is the output specification, which means in this case that the barrier is the first indirection (or pointed-to value, ``*``) of the second argument (``Argument[1]``) passed to the function.
214+
- The eighth value ``"sql-injection"`` is the kind of the barrier. The barrier kind is used to define the queries where the barrier is in scope.
215+
- The ninth value ``"manual"`` is the provenance of the barrier, which is used to identify the origin of the barrier model.
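A barrier does not have to be an output argument. The following is a minimal sketch, assuming a hypothetical escaping function ``escape`` in a hypothetical namespace ``myapp::db`` that returns the escaped string by value. Neither name comes from a real library; they only illustrate how the output specification changes when the sanitized data is the return value.

```yaml
extensions:
  - addsTo:
      pack: codeql/cpp-all
      extensible: barrierModel
    data:
      # Hypothetical: std::string myapp::db::escape(const std::string &input).
      # The sanitized data is the return value, so the output is "ReturnValue".
      - ["myapp::db", "", False, "escape", "", "", "ReturnValue", "sql-injection", "manual"]
```

Compared with the ``mysql_real_escape_string`` tuple, only the namespace, the function name, and the seventh value differ.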
216+
217+
Example: Add a barrier guard
218+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
219+
220+
This example shows how to model a barrier guard that stops the flow of taint when a conditional check is performed on data.
221+
A barrier guard model is used when a function returns a boolean that indicates whether the data is safe to use.
222+
Consider a function called ``is_safe`` which returns ``true`` when the data is considered safe.
223+
224+
.. code-block:: cpp
225+
226+
if (is_safe(user_input)) { // The check guards the use, so the input is safe.
227+
mysql_query(mysql, user_input); // This is safe. Here mysql is the connection handle, as in the previous example.
228+
}
229+
230+
We need to add a tuple to the ``barrierGuardModel(namespace, type, subtypes, name, signature, ext, input, acceptingValue, kind, provenance)`` extensible predicate by updating a data extension file.
231+
232+
.. code-block:: yaml
233+
234+
extensions:
235+
- addsTo:
236+
pack: codeql/cpp-all
237+
extensible: barrierGuardModel
238+
data:
239+
- ["", "", False, "is_safe", "", "", "Argument[*0]", "true", "sql-injection", "manual"]
240+
241+
The first five values identify the callable (in this case a free function) to be modeled as a barrier guard.
242+
243+
- The first value ``""`` is the namespace name.
244+
- The second value ``""`` is the name of the type (class) that contains the method. Because we're modeling a free function, the type is left blank.
245+
- The third value ``False`` is a flag that indicates whether or not the model also applies to all overrides of the method. For a free function, this should be ``False``.
246+
- The fourth value ``"is_safe"`` is the function name.
247+
- The fifth value is the function input type signature, which can be used to narrow down between functions that have the same name.
248+
249+
The sixth value should be left empty and is out of scope for this documentation.
250+
The remaining values are used to define the input specification, the ``acceptingValue``, the ``kind``, and the ``provenance`` (origin) of the barrier guard.
251+
252+
- The seventh value ``"Argument[*0]"`` is the input specification (the value being validated). In this case, it is the first indirection (or pointed-to value, ``*``) of the first argument (``Argument[0]``) passed to the function.
253+
- The eighth value ``"true"`` is the accepting value of the barrier guard. This is the value that the conditional check must return for the barrier to apply.
254+
- The ninth value ``"sql-injection"`` is the kind of the barrier guard. The barrier guard kind is used to define the queries where the barrier guard is in scope.
255+
- The tenth value ``"manual"`` is the provenance of the barrier guard, which is used to identify the origin of the barrier guard model.
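Some checks report that data is unsafe rather than safe. The following is a minimal sketch for that case, assuming a hypothetical function ``has_sql_metacharacters`` that returns ``true`` when its argument contains dangerous characters. It also assumes that ``"false"`` is a supported accepting value, which is not stated in this section; verify it against the CodeQL library before relying on it.

```yaml
extensions:
  - addsTo:
      pack: codeql/cpp-all
      extensible: barrierGuardModel
    data:
      # Hypothetical: bool has_sql_metacharacters(const char *input).
      # The data is treated as safe only on the branch where the check
      # returns false, so the accepting value is "false" (assumed to be supported).
      - ["", "", False, "has_sql_metacharacters", "", "", "Argument[*0]", "false", "sql-injection", "manual"]
```

With this model, the barrier would apply inside ``if (!has_sql_metacharacters(user_input)) { ... }``, mirroring the ``is_safe`` example with the condition inverted.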
256+
179257
.. _threat-models-cpp:
180258

181259
Threat models
