Leveraging ANTLR for IAM Policy Automation

October 20, 2023 | 11 minute read
Gordon Trevorrow
Principal Cloud Architect

Text 2 Tree

Teams working with OCI may sometimes need the ability to programmatically analyze, modify, and create IAM policy statements. For example, teams might want to validate new IAM policies against existing ones to detect redundant policy statements, assess user access, and generate reports. They might also want to implement IAM policy meta policies (policies about policies) to control the content of valid policies within an OCI account, or make bulk changes to existing policies.

To achieve this quickly and efficiently, they need a way to generate a parse tree from IAM policy statements. A parse tree is a data structure that represents the syntactic structure of a policy statement. By having a parse tree, they can computationally scrutinize each part of a policy declaration. This article will go into detail on how to create a lexer and parser using ANTLR that can generate such an IAM policy parse tree.

Regular expressions are a poor fit for parsing a language like OCI IAM policies, because policy conditions can nest recursively, as in ANY {a, ALL {b, c, ANY {d, e}}}, and regular expressions cannot reliably match structures nested to an arbitrary depth.
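To make the recursion point concrete, here is a minimal hand-written sketch in Java. It is not the ANTLR-based approach this article uses, and the class and method names are hypothetical. It reads only the simplified form shown above (bare-word leaves inside ANY/ALL groups) and reports how deeply the groups nest. The key detail is that the condition rule calls itself, which is exactly what a single regular expression cannot express.

```java
// Illustrative only: a tiny recursive-descent reader for the simplified
// nested form ANY {a, ALL {b, c, ANY {d, e}}}. All names are hypothetical.
class NestedConditionSketch {
    private final String src;
    private int pos = 0;

    private NestedConditionSketch(String src) {
        this.src = src;
    }

    // Returns the maximum nesting depth of ANY/ALL groups (0 for a bare leaf).
    static int depth(String text) {
        NestedConditionSketch reader = new NestedConditionSketch(text);
        int result = reader.condition();
        reader.skipSpaces();
        if (reader.pos != reader.src.length()) {
            throw new IllegalArgumentException("unexpected input at " + reader.pos);
        }
        return result;
    }

    // condition : (ANY | ALL) '{' condition (',' condition)* '}' | leaf
    private int condition() {
        String word = word();
        boolean combiner = word.equalsIgnoreCase("any") || word.equalsIgnoreCase("all");
        if (!combiner || !consume('{')) {
            return 0; // a leaf such as a, b or c
        }
        int deepest = 0;
        do {
            deepest = Math.max(deepest, condition()); // the rule calls itself
        } while (consume(','));
        if (!consume('}')) {
            throw new IllegalArgumentException("expected '}' at " + pos);
        }
        return 1 + deepest;
    }

    // Reads one run of characters up to whitespace or one of { } ,
    private String word() {
        skipSpaces();
        int start = pos;
        while (pos < src.length() && "{},".indexOf(src.charAt(pos)) < 0
                && !Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("expected a word at " + pos);
        }
        return src.substring(start, pos);
    }

    // Skips whitespace, then consumes the expected character if it is next.
    private boolean consume(char expected) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == expected) {
            pos++;
            return true;
        }
        return false;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) {
            pos++;
        }
    }

    public static void main(String[] args) {
        System.out.println(depth("ANY {a, ALL {b, c, ANY {d, e}}}")); // prints 3
    }
}
```

A generated ANTLR parser does the same kind of recursion for you, driven by the grammar rather than by hand-written methods.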

 

A short introduction to ANTLR

ANTLR (ANother Tool for Language Recognition) is a tool designed for developers who want to construct parsers that can recognize the structure of computer languages. To use ANTLR, you must create a grammar. A grammar is a set of rules expressed in a textual form that defines the syntax of the language. From a grammar definition, ANTLR can then generate a lexer and parser for the language. The lexer and parser are generated in a programming language that ANTLR supports, such as Java, Python, C#, and so on. In addition, ANTLR will generate listeners and visitors that allow a program to "walk" the parse tree and perform actions based on identified language structures.

 

Defining an ANTLR grammar for the OCI IAM Policy language

As mentioned, to get ANTLR to generate a parser for the OCI IAM Policy language, we start by creating an ANTLR grammar that contains the lexical and syntactic (parser) rules of the OCI IAM policy Domain Specific Language (DSL).

 

This article presents an ANTLR grammar for IAM policies that I created as part of a project to perform bulk analysis and refactoring of IAM policies. The grammar is suitable for the purpose of parsing valid IAM policy statements, but it is not suitable for detecting all types of invalid IAM policy statements. Additional refinement will be necessary to use it for IAM policy validation.

 

ANTLR uses the lexical rules to generate a tokenizer/lexer that can scan an OCI IAM policy statement and produce a stream of language tokens. Additionally, the grammar includes ANTLR fragments, which are reusable elements of the lexer rules that do not form tokens of the DSL on their own.

/* * Lexer Rules */ 
BEFORE : B E F O R E ; 
BETWEEN : B E T W E E N; 
NEWLINE : ('\r'? '\n' | '\r')+ -> skip; 
WS : ' '+ -> skip; 
ANYUSER : A N Y '-' U S E R ; 
ANYTENANCY : A N Y '-' T E N A N C Y ; 
ENDORSE : E N D O R S E ; 
ALLOW : A L L O W; 
DEFINE : D E F I N E ; 
TO : T O; 
OF : O F ; 
IN : I N; 
WHERE : W H E R E ; 
WITH : W I T H ; 
DYNAMICGROUP : D Y N A M I C '-' G R O U P ; 
GROUP : G R O U P ; 
SERVICE : S E R V I C E ; 
COMPARTMENT : C O M P A R T M E N T ; 
TENANCY : T E N A N C Y; 
READ : R E A D ; 
INSPECT : I N S P E C T ; 
MANAGE : M A N A G E ; 
ASSOCIATE : A S S O C I A T E ; 
ADMIT : A D M I T ; 
USE : U S E ; 
ANY : A N Y ; 
AND : A N D; 
ALL : A L L ; 
AS : A S; 
ID : I D; 
WORD : (LETTER | DIGIT | '_' | '-' | '.' | ':'| '@')+ ; // Word is last to prevent ambiguity
fragment LETTER : [a-zA-Z] ; 
fragment DIGIT : [0-9] ; 
fragment A : ('a'|'A') ; 
fragment L : ('l'|'L') ; 
fragment O : ('o'|'O') ; 
fragment W : ('w'|'W') ; 
fragment I : ('i'|'I') ; 
fragment N : ('n'|'N') ; 
fragment T : ('t'|'T') ; 
fragment E : ('e'|'E') ; 
fragment R : ('r'|'R') ; 
fragment H : ('h'|'H') ; 
fragment U : ('u'|'U') ; 
fragment P : ('p'|'P') ; 
fragment S : ('s'|'S') ; 
fragment V : ('v'|'V') ; 
fragment C : ('c'|'C') ; 
fragment D : ('d'|'D') ; 
fragment M : ('m'|'M') ; 
fragment G : ('g'|'G') ; 
fragment Y : ('y'|'Y') ; 
fragment F : ('f'|'F') ; 
fragment B : ('b'|'B');
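The comment on the WORD rule refers to how an ANTLR lexer resolves overlaps: it takes the longest match at the current position, and when two rules match the same length, the rule listed first wins. The sketch below mimics that selection in plain Java with java.util.regex. It is an illustration under simplified assumptions, not ANTLR's implementation; the class name, the method name, and the reduced rule set are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: mimics how a lexer picks a token type for the text at
// the current position. Longest match wins; on a tie, the earlier rule wins.
class LexerRuleOrderSketch {
    // Insertion order stands in for rule order in the grammar; WORD is last.
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("ANYUSER", Pattern.compile("any-user", Pattern.CASE_INSENSITIVE));
        RULES.put("ALLOW", Pattern.compile("allow", Pattern.CASE_INSENSITIVE));
        RULES.put("IN", Pattern.compile("in", Pattern.CASE_INSENSITIVE));
        RULES.put("ANY", Pattern.compile("any", Pattern.CASE_INSENSITIVE));
        RULES.put("WORD", Pattern.compile("[a-zA-Z0-9_.:@-]+"));
    }

    // Returns the token type chosen for the start of the input, or null if no rule matches.
    static String tokenTypeAt(String input) {
        String best = null;
        int bestLength = 0;
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            // Strictly greater: an equally long later rule never replaces an earlier one.
            if (m.lookingAt() && m.end() > bestLength) {
                best = rule.getKey();
                bestLength = m.end();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(tokenTypeAt("Allow"));           // ALLOW (ties with WORD, listed first)
        System.out.println(tokenTypeAt("any-user"));        // ANYUSER (longer than ANY)
        System.out.println(tokenTypeAt("instance-family")); // WORD (longer than IN)
    }
}
```

If WORD were listed before the keywords, every keyword would tie with WORD and lose, so the parser would never see an ALLOW or IN token.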

 

Parser rules are used to generate a parser for the DSL that will consume the stream of tokens to produce a parse tree.

policy              : ( allowExpression | endorseExpression | defineExpression | admitExpression )+  EOF ;

allowExpression     : ALLOW subject (TO? verb resource | TO? permissionList) IN scope (WHERE condition)? NEWLINE?;
endorseExpression   : ENDORSE subject (TO endorseVerb resource | TO? permissionList) IN (endorseScope | (scope WITH resource IN endorseScope)) (WHERE condition)? NEWLINE?;
defineExpression    : DEFINE definedSubject AS defined NEWLINE?;
admitExpression     : ADMIT subject (OF endorseScope)?  (TO endorseVerb resource | TO? permissionList) IN scope (WITH resource IN endorseScope)? (WHERE condition)? NEWLINE?;

endorseVerb         : (verb | ASSOCIATE);
verb                : (INSPECT | READ | USE | MANAGE) ;
permissionList      : '{'  WORD  (',' WORD)* '}'  ; // e.g. {USER_UPDATE, USER_UIPASS_SET}
scope               : ((COMPARTMENT ID?)  WORD (':' WORD)* | TENANCY) ;
endorseScope        : (ANYTENANCY| TENANCY WORD);
subject             : (groupSubject | serviceSubject | dynamicGroupSubject | ANYUSER) ;
groupSubject        : GROUP (groupName| groupID) (','(groupName|groupID))* ;
groupName           : (WORD | '\'' WORD '\'/\'' WORD '\'' | '\'' WORD '/' WORD '\'' |  '\'' WORD  '\''| WORD '/' WORD ); // NAME or 'WORD'/'WORD' or 'NAME' or 'WORD/WORD'
groupID             : ID WORD;
serviceSubject      : SERVICE WORD (',' WORD)*;
dynamicGroupSubject : DYNAMICGROUP ID? WORD ;
tenancySubject      : TENANCY WORD ;
definedSubject      : (groupSubject | dynamicGroupSubject | serviceSubject | tenancySubject);
defined             : WORD;
resource            : WORD;
condition           : (comparisonList | comparison) ;
comparison          : variable operator (value|valueList|timeWindow| patternMatch) ;
variable            : WORD (('.' WORD )+)? ;
operator            : ('=' | '!''=' | BEFORE | IN | BETWEEN) ;
value               : (WORD | '\'' WORD '\''| '\'' WORD '/' WORD '\'') ;
valueList           : '(' '\'' WORD '\'' ( ',' '\'' WORD '\'')*  ')';
timeWindow          : '\'' WORD '\'' AND '\'' WORD '\'';
comparisonList      : logicalCombine '{' condition  (',' condition)* '}' ;
logicalCombine      : ( ALL | ANY ) ;
patternMatch        : ('/' WORD '*/'|'/*' WORD '/'| '/' WORD '/') ;

 

Testing the grammar and generating the lexer, parser, listener, and visitor code

With the ANTLR grammar in hand, half the battle is won. We can now input the grammar into ANTLR to generate the lexer and parser that will take OCI policy statements and output a parse tree.

There are a few options for doing this:

  1. Using the ANTLR CLI
  2. Using the Gradle ANTLR plugin
  3. Using the IntelliJ ANTLR plugin
  4. Using the Visual Studio Code ANTLR plugin

Which option you choose will depend on your preferred development environment and tools.

I used the IntelliJ plugin for ANTLR in my project, which made it easy to define and evaluate the grammar. Its preview feature lets me select a file of policy samples and generates, among other views, a parse tree visualization for the OCI IAM policy expressions. It also flags any parts of the input policy statements that it could not parse. This lets me update the grammar and immediately see the effect on the resulting parse tree.

For example, it will parse the following OCI IAM Policy expression:

ALLOW SERVICE ztb_hosts_test to MANAGE objects in COMPARTMENT ZtbInternal 
where 
any {request.permission = 'OBJECT_INSPECT'
,request.permission = 'OBJECT_READ'
,request.permission = 'OBJECT_CREATE'
,request.permission = 'OBJECT_OVERWRITE'}

and produce this parse tree visualization:

ANTLR IntelliJ plugin parse tree visualization

Once I'm satisfied that the grammar is correct, I can use the plugin to trigger the ANTLR code generation (I used Java).

As mentioned, in addition to the lexer and parser, ANTLR will also generate code that allows your program to traverse the parse tree. ANTLR provides a built-in tree walker implementation that will call back to a developer-provided listener whenever it arrives at or leaves a node in the parse tree. A visitor interface and a visitor abstract base class are also generated, which allow a developer to traverse the parse tree more directly.

"The biggest difference between the listener and visitor mechanisms is that listener methods are called independently by an ANTLR-provided walker object, whereas visitor methods must walk their children with explicit visit calls. Forgetting to invoke visitor methods on a node's children, means those subtrees don't get visited"

https://github.com/antlr/antlr4/blob/master/doc/listeners.md
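The difference described in the quote can be shown without ANTLR at all. The following Java sketch uses a small hand-rolled tree rather than ANTLR's generated classes, so every type and method name in it is hypothetical; the node names simply borrow rule names from the grammar above. In the listener style, the walker drives the traversal and reaches every node. In the visitor style, the visiting code chooses whether to descend, so a subtree that is never visited is simply skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: a hand-rolled tree, not ANTLR's generated classes.
class WalkStyleSketch {
    static class Node {
        final String name;
        final List<Node> children;

        Node(String name, Node... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }
    }

    // Listener style: the walker owns the traversal and calls back on enter/exit.
    interface Listener {
        void enter(Node node);
        void exit(Node node);
    }

    static void walk(Listener listener, Node node) {
        listener.enter(node);
        for (Node child : node.children) {
            walk(listener, child);
        }
        listener.exit(node);
    }

    // Visitor style: the visiting code decides whether to descend into children.
    static List<String> visit(Node node, boolean descendIntoConditions) {
        List<String> seen = new ArrayList<>();
        seen.add(node.name);
        if (node.name.equals("condition") && !descendIntoConditions) {
            return seen; // children are skipped because nothing visits them
        }
        for (Node child : node.children) {
            seen.addAll(visit(child, descendIntoConditions));
        }
        return seen;
    }

    static Node sampleTree() {
        return new Node("allowExpression",
                new Node("subject"),
                new Node("verb"),
                new Node("condition", new Node("comparison")));
    }

    // Collects node names in the order the walker enters them.
    static List<String> namesSeenByListener(Node root) {
        final List<String> seen = new ArrayList<>();
        walk(new Listener() {
            @Override
            public void enter(Node node) {
                seen.add(node.name);
            }

            @Override
            public void exit(Node node) {
                // nothing to do when leaving a node in this sketch
            }
        }, root);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(namesSeenByListener(sampleTree()));
        System.out.println(visit(sampleTree(), false));
    }
}
```

For reporting tasks that need to see every node, the listener style is usually the simpler choice. The visitor style fits better when the code needs to return values from subtrees or deliberately skip parts of the tree.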

 

Test Policies

Here are some representative OCI IAM policy statements that I used to test the grammar.

##

# Allow expressions

#

ALLOW GROUP foo to manage all-resources in compartment foo:bar

allow group lz2-iam-admin-group to inspect users in tenancy

ALLOW SERVICE hosts_test to MANAGE objects in COMPARTMENT ZtbInternal where any {request.permission = 'OBJECT_INSPECT',request.permission = 'OBJECT_READ',request.permission = 'OBJECT_CREATE',request.permission = 'OBJECT_OVERWRITE'}

ALLOW SERVICE hosts_test to MANAGE objects in COMPARTMENT Internal where request.target.bucket = 'ct-test'

Allow dynamic-group DynamicGroupABC to manage object-family in tenancy

Allow group CANARY-USER to {USER_UPDATE, USER_UIPASS_SET} in compartment prod_canary

allow any-user {PAR_MANAGE} in compartment dis_prod_canary where ALL {request.principal.type='workspace'}

ALLOW any-user to manage dataflow-application in tenancy where ALL {request.principal.type = 'workspace'}

allow any-user {PAR_MANAGE} in tenancy where ALL {request.principal.type='workspace'}

# Group name and compartment name containing '-'

Allow group NewCIS-network-admin-group to read all-resources in compartment NewCIS-network-cmp

# Pattern match examples

Allow any-user to use virtual-network-family in tenancy where any { request.permission !=/SUBNET*/,all{request.permission=/SUBNET*/,target.resource.tag.Operations.Customer=request.principal.group.tag.Operations.Customer}}

Allow group test to manage all-resources in compartment Test where all { request.permission!=/*_SUBNET/,request.permission!=/*_ROUTE/, request.permission!=/*_POLICY/ }

# list of service subjects , compartment name with '.' in name

allow service blockstorage, fssoc1prod to use keys in compartment foo.bar

#list of group name subjects

Allow group A-Admins , B-Admins to manage all-resources in compartment Projects-A-and-B

#list of group id subjects

Allow group id oicid , id ocid2 to manage all-resources in compartment Projects-A-and-B

# list of mixed group id and group name subjects. The existing grammar

# accepts this expression under the following parser rule:

# groupSubject : GROUP (groupName| groupID) (','(groupName|groupID))* ;

# however, it is not a valid OCI IAM policy expression.

Allow group id oicid , B-Admins to manage all-resources in compartment Projects-A-and-B

#Nested conditions all{..,any{..}}

allow any-user to manage dataflow-application in tenancy where ALL {request.principal.type = 'workspace', any {FOO!=BAR, bar='BAR'}}

Allow group att_iam_admin to manage users in tenancy where all{any{request.operation = 'ListOAuthClientCredentials', request.operation = 'CreateOAuthClientCredential',request.operation = 'DeleteOAuthClientCredential', request.operation = 'UpdateOAuthClientCredential' }, any{ target.user.name = 'oracleidentitycloudservice/manasi.vaishampayan@oracle.com' } }

# BEFORE operator

Allow group Contractors to manage instance-family in tenancy where request.utc-timestamp before '2022-01-01T00:00Z'

#IN operator

Allow group SummerInterns to manage instance-family in tenancy where ANY {request.utc-timestamp.month-of-year in ('6', '7', '8')}

Allow group ComplianceAuditors to read all-resources in tenancy where request.utc-timestamp.day-of-month = '1'

#BETWEEN operator

Allow group DayShift to manage instance-family in tenancy where request.utc-timestamp.time-of-day between '17:00:00Z' and '01:00:00Z'

#dynamic group with permission list & no verb resource

allow DYNAMIC-GROUP logging_analytics_agent to {LOG_ANALYTICS_LOG_GROUP_UPLOAD_LOGS} in tenancy

# Allow expression with compartment id

allow any-user to {LOG_ANALYTICS_LOG_GROUP_UPLOAD_LOGS} in compartment id ocid1.tenancy.oc1..aaaaaaaabpm2vrjioztrjhwxysfkbekblv3jqbgyxq7itbbort45qpefdxxx where all {request.principal.type='serviceconnector', target.loganalytics-log-group.id='ocid1.loganalyticsloggroup.oc1.iad.amaaaaaazm7p2cyaeropw32wuathpseypdic5xfduexdafr7szvuussmxxx', request.principal.compartment.id='ocid1.tenancy.oc1..aaaaaaaabpm2vrjioztrjhwxysfkbekblv3jqbgyxq7itbbort45qpefdxxx'}

# group subject names

allow group 'domain'/'group_name' to manage all-resources in tenancy

Allow group 'foo/bar' to manage all-resources in tenancy

allow group foo/bar to manage all-resources in tenancy

##

# Define expressions

#

define tenancy boat as ocid1.tenancy.oc1..aaaaaaaagkbzgg6lpzrf47xzy4rjoxg4de6ncfiq2rncmjiujvy2jgxxx

define group fleetmanagement_grp as ocid1.group.oc1..aaaaaaaakhzvirthamdvas6hkvmh7eq7cbknygrcsjrsg2kadivi2gxxx

define dynamic-group ADBVAULTDG as ocid1.dynamicgroup.oc1..aaaaaaaan7pmegvo5vmv6aiql2ysllrvym3yrx4ngry4j2zihryx

#Admit expressions

admit group fleetmanagement_grp of tenancy boat to manage all-resources in compartment fleet_manager

admit group fleetmanagement_grp of tenancy boat to read all-resources in tenancy

admit dynamic-group ADBVAULTDG of tenancy foo to use vaults in tenancy

admit group ChildZoneGroup of tenancy ChildZoneTenancy to manage dns-records in tenancy where all {target.dns-zone.name = 'oci-lab.cloud', target.dns-record.type = 'NS', target.dns-domain.name = 'tenancy2.oci-lab.cloud'}

admit service blockstorage to manage vault in tenancy

admit group ChildZoneGroup of tenancy ChildZoneTenancy to associate dns-records in tenancy with dns-zones in tenancy ChildZoneTenancy

##

# Endorse expressions

#

Endorse dynamic-group DynamicGroupABC to manage object-family in tenancy DestinationTenancy

Endorse group CANARY-USER to manage object-family in tenancy DestinationTenancy

Endorse group Administrators to associate local-peering-gateways in tenancy with local-peering-gateways in any-tenancy
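One of the test policies above mixes a group ID and a group name in the same subject list, which the grammar accepts even though it is not a valid policy. A check like that is easier to apply after parsing than to encode in the grammar. The Java sketch below shows one possible shape for such a check. The GroupRef type and the isConsistent method are hypothetical; in a real project the entries would be collected from the groupName and groupID nodes of the parse tree.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: a post-parse check using a hypothetical data shape.
class GroupSubjectCheckSketch {
    // One entry of a group subject list: either a name or an "id <ocid>" reference.
    static class GroupRef {
        final boolean byId;
        final String text;

        GroupRef(boolean byId, String text) {
            this.byId = byId;
            this.text = text;
        }
    }

    // True when every entry uses the same form (all names or all IDs).
    static boolean isConsistent(List<GroupRef> refs) {
        for (GroupRef ref : refs) {
            if (ref.byId != refs.get(0).byId) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Mirrors "group id oicid , B-Admins" from the test policies.
        List<GroupRef> mixed = Arrays.asList(
                new GroupRef(true, "oicid"),
                new GroupRef(false, "B-Admins"));
        System.out.println(isConsistent(mixed)); // prints false
    }
}
```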

 

Conclusion

The grammar has proven to be accurate in my project and has been able to describe every valid IAM policy statement I have come across so far. However, as previously stated, the grammar may not be complete enough to identify all erroneous IAM policy statements.

 

Resources

Lexer: https://en.wikipedia.org/wiki/Lexical_analysis

Parser : https://en.wikipedia.org/wiki/Parsing#Parser

Parse Tree : https://en.wikipedia.org/wiki/Parse_tree

ANTLR home: https://www.antlr.org/

IntelliJ ANTLR plugin home: https://plugins.jetbrains.com/plugin/7358-antlr-v4

 
