Teams working with OCI may sometimes need the ability to programmatically analyze, modify, and create IAM policy statements. For example, teams might want to validate new IAM policies against existing ones to detect redundant policy statements, assess user access, and generate reports. They might also want to implement IAM policy meta policies (policies about policies) to control the content of valid policies within an OCI account, or make bulk changes to existing policies.
To achieve this quickly and efficiently, they need a way to generate a parse tree from IAM policy statements. A parse tree is a data structure that represents the syntactic structure of a policy statement. By having a parse tree, they can computationally scrutinize each part of a policy declaration. This article will go into detail on how to create a lexer and parser using ANTLR that can generate such an IAM policy parse tree.
ANTLR (ANother Tool for Language Recognition) is a tool designed for developers who want to construct parsers that can recognize the structure of computer languages. To use ANTLR, you must create a grammar. A grammar is a set of rules expressed in a textual form that defines the syntax of the language. From a grammar definition, ANTLR can then generate a lexer and parser for the language. The lexer and parser are generated in a programming language that ANTLR supports, such as Java, Python, C#, and so on. In addition, ANTLR will generate listeners and visitors that allow a program to "walk" the parse tree and perform actions based on identified language structures.
As mentioned, to get ANTLR to generate a parser for the OCI IAM policy language, we start by creating an ANTLR grammar that contains the lexical and syntactic (parser) rules of the OCI IAM policy Domain Specific Language (DSL).
ANTLR uses the lexical rules to generate a tokenizer/lexer that can scan an OCI IAM policy statement and produce a stream of language tokens. Additionally, the grammar includes ANTLR fragments, which are reusable building blocks for lexer rules that do not form tokens of the DSL on their own.
/*
 * Lexer Rules
 */
BEFORE : B E F O R E ;
BETWEEN : B E T W E E N ;
NEWLINE : ('\r'? '\n' | '\r')+ -> skip ;
WS : ' '+ -> skip ;
ANYUSER : A N Y '-' U S E R ;
ANYTENANCY : A N Y '-' T E N A N C Y ;
ENDORSE : E N D O R S E ;
ALLOW : A L L O W ;
DEFINE : D E F I N E ;
TO : T O ;
OF : O F ;
IN : I N ;
WHERE : W H E R E ;
WITH : W I T H ;
DYNAMICGROUP : D Y N A M I C '-' G R O U P ;
GROUP : G R O U P ;
SERVICE : S E R V I C E ;
COMPARTMENT : C O M P A R T M E N T ;
TENANCY : T E N A N C Y ;
READ : R E A D ;
INSPECT : I N S P E C T ;
MANAGE : M A N A G E ;
ASSOCIATE : A S S O C I A T E ;
ADMIT : A D M I T ;
USE : U S E ;
ANY : A N Y ;
AND : A N D ;
ALL : A L L ;
AS : A S ;
ID : I D ;
WORD : (LETTER | DIGIT | '_' | '-' | '.' | ':' | '@')+ ; // Word is last to prevent ambiguity

fragment LETTER : [a-zA-Z] ;
fragment DIGIT : [0-9] ;
fragment A : ('a'|'A') ;
fragment L : ('l'|'L') ;
fragment O : ('o'|'O') ;
fragment W : ('w'|'W') ;
fragment I : ('i'|'I') ;
fragment N : ('n'|'N') ;
fragment T : ('t'|'T') ;
fragment E : ('e'|'E') ;
fragment R : ('r'|'R') ;
fragment H : ('h'|'H') ;
fragment U : ('u'|'U') ;
fragment P : ('p'|'P') ;
fragment S : ('s'|'S') ;
fragment V : ('v'|'V') ;
fragment C : ('c'|'C') ;
fragment D : ('d'|'D') ;
fragment M : ('m'|'M') ;
fragment G : ('g'|'G') ;
fragment Y : ('y'|'Y') ;
fragment F : ('f'|'F') ;
fragment B : ('b'|'B') ;
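The comment on the WORD rule relies on how an ANTLR lexer resolves overlapping rules: the rule that matches the most characters wins, and when two rules match the same number of characters, the one defined first wins. The Python sketch below imitates that behavior with regular expressions. It is only a hand-written approximation for illustration, not the code ANTLR generates; the `RULES` table is a trimmed-down, hypothetical subset of the lexer rules, and `tokenize` is a name invented for this example.

```python
import re

# Hypothetical subset of the lexer rules, in grammar order (WORD last).
RULES = [
    ("ANYUSER", r"any-user"),
    ("ALLOW", r"allow"),
    ("TO", r"to"),
    ("IN", r"in"),
    ("GROUP", r"group"),
    ("TENANCY", r"tenancy"),
    ("MANAGE", r"manage"),
    ("ALL", r"all"),
    ("WORD", r"[A-Za-z0-9_\-.:@]+"),
]
# IGNORECASE plays the role of the per-letter fragments (A, L, O, ...).
COMPILED = [(name, re.compile(pattern, re.IGNORECASE)) for name, pattern in RULES]


def tokenize(text):
    """Return (token type, lexeme) pairs: longest match wins, first rule wins ties."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():  # roughly mirrors the WS / NEWLINE '-> skip' rules
            pos += 1
            continue
        best_name, best_end = None, pos
        for name, regex in COMPILED:
            match = regex.match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so on a tie the earlier rule is kept.
            if match and match.end() > best_end:
                best_name, best_end = name, match.end()
        if best_name is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        tokens.append((best_name, text[pos:best_end]))
        pos = best_end
    return tokens


for token in tokenize("Allow group Admins to manage all-resources in tenancy"):
    print(token)
```

In this sketch, `Allow` ties between ALLOW and WORD and becomes ALLOW because ALLOW is listed first, while `all-resources` becomes a single WORD because WORD matches more characters than ALL does.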
Parser rules are used to generate a parser for the DSL that consumes the stream of tokens to produce a parse tree.
policy : ( allowExpression | endorseExpression | defineExpression | admitExpression )+ EOF ;
allowExpression : ALLOW subject (TO? verb resource | TO? permissionList) IN scope (WHERE condition)? NEWLINE? ;
endorseExpression : ENDORSE subject (TO endorseVerb resource | TO? permissionList) IN (endorseScope | (scope WITH resource IN endorseScope)) (WHERE condition)? NEWLINE? ;
defineExpression : DEFINE definedSubject AS defined NEWLINE? ;
admitExpression : ADMIT subject (OF endorseScope)? (TO endorseVerb resource | TO? permissionList) IN scope (WITH resource IN endorseScope)? (WHERE condition)? NEWLINE? ;
endorseVerb : (verb | ASSOCIATE) ;
verb : (INSPECT | READ | USE | MANAGE) ;
permissionList : '{' WORD (',' WORD)* '}' ; // e.g {USER_UPDATE, USER_UIPASS_SET, USER_UIPASS_SET}
scope : ((COMPARTMENT ID?) WORD (':' WORD)* | TENANCY) ;
endorseScope : (ANYTENANCY | TENANCY WORD) ;
subject : (groupSubject | serviceSubject | dynamicGroupSubject | ANYUSER) ;
groupSubject : GROUP (groupName | groupID) (',' (groupName | groupID))* ;
groupName : (WORD | '\'' WORD '\'/\'' WORD '\'' | '\'' WORD '/' WORD '\'' | '\'' WORD '\'' | WORD '/' WORD ) ; // NAME or 'WORD'/'WORD' or 'NAME' or 'WORD/WORD'
groupID : ID WORD ;
serviceSubject : SERVICE WORD (',' WORD)* ;
dynamicGroupSubject : DYNAMICGROUP ID? WORD ;
tenancySubject : TENANCY WORD ;
definedSubject : (groupSubject | dynamicGroupSubject | serviceSubject | tenancySubject) ;
defined : WORD ;
resource : WORD ;
condition : (comparisonList | comparison) ;
comparison : variable operator (value | valueList | timeWindow | patternMatch) ;
variable : WORD (('.' WORD)+)? ;
operator : ('=' | '!''=' | BEFORE | IN | BETWEEN) ;
value : (WORD | '\'' WORD '\'' | '\'' WORD '/' WORD '\'') ;
valueList : '(' '\'' WORD '\'' ( ',' '\'' WORD '\'')* ')' ;
timeWindow : '\'' WORD '\'' AND '\'' WORD '\'' ;
comparisonList : logicalCombine '{' condition (',' condition)* '}' ;
logicalCombine : ( ALL | ANY ) ;
patternMatch : ('/' WORD '*/' | '/*' WORD '/' | '/' WORD '/') ;
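To get a feel for how parser rules turn a token stream into a tree, here is a small hand-written recursive-descent sketch in Python in which each parser rule becomes one method. It covers only a deliberately simplified subset of `allowExpression` (one group name or `any-user`, no permission list, no compartment path, no `WHERE` clause). The `AllowParser` class and the nested-tuple tree shape are assumptions made for this illustration; the parser ANTLR generates is far more complete and looks different.

```python
VERBS = {"INSPECT", "READ", "USE", "MANAGE"}


class AllowParser:
    """Recursive-descent parser for a simplified subset of allowExpression."""

    def __init__(self, tokens):
        self.tokens = tokens  # list of (token type, text) pairs from a lexer
        self.pos = 0

    def peek(self):
        """Type of the next token, or 'EOF' when the input is exhausted."""
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else "EOF"

    def expect(self, kind):
        """Consume the next token if it has the given type, else fail."""
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, found {self.peek()} at token {self.pos}")
        text = self.tokens[self.pos][1]
        self.pos += 1
        return text

    # allowExpression : ALLOW subject TO? verb resource IN scope EOF   (simplified)
    def allow_expression(self):
        self.expect("ALLOW")
        subject = self.subject()
        if self.peek() == "TO":  # TO is optional in the grammar
            self.expect("TO")
        verb = self.verb()
        resource = ("resource", self.expect("WORD"))
        self.expect("IN")
        scope = self.scope()
        if self.peek() != "EOF":
            raise SyntaxError(f"unexpected {self.peek()} at token {self.pos}")
        return ("allowExpression", subject, verb, resource, scope)

    # subject : GROUP WORD | ANYUSER   (simplified)
    def subject(self):
        if self.peek() == "ANYUSER":
            return ("subject", self.expect("ANYUSER"))
        self.expect("GROUP")
        return ("subject", ("groupSubject", self.expect("WORD")))

    # verb : INSPECT | READ | USE | MANAGE
    def verb(self):
        if self.peek() not in VERBS:
            raise SyntaxError(f"expected a verb, found {self.peek()} at token {self.pos}")
        return ("verb", self.expect(self.peek()))

    # scope : COMPARTMENT WORD | TENANCY   (simplified)
    def scope(self):
        if self.peek() == "TENANCY":
            return ("scope", self.expect("TENANCY"))
        self.expect("COMPARTMENT")
        return ("scope", ("compartment", self.expect("WORD")))


sample_tokens = [
    ("ALLOW", "allow"), ("GROUP", "group"), ("WORD", "Admins"), ("TO", "to"),
    ("MANAGE", "manage"), ("WORD", "all-resources"), ("IN", "in"), ("TENANCY", "tenancy"),
]
print(AllowParser(sample_tokens).allow_expression())
```

Each nested tuple corresponds to one node of the parse tree, labeled with the rule that produced it, which is the same information the parse tree visualization conveys.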
With the ANTLR grammar in hand, half the battle is won. We can now input the grammar into ANTLR to generate the lexer and parser that will take OCI policy statements and output a parse tree.
There are a few options for doing this:
Which option you choose will depend on your preferred development environment and tools.
I used the IntelliJ plugin for ANTLR in my project, which made it easy to define and evaluate the grammar. Its preview feature lets me select a file of policy samples and then shows, among other views, a parse tree visualization for the OCI IAM policy expressions, flagging any parts of the input that it could not parse. This lets me update the grammar and immediately see the effect on the resulting parse tree.
For example, it will parse the following OCI IAM Policy expression:
ALLOW SERVICE ztb_hosts_test to MANAGE objects in COMPARTMENT ZtbInternal where any {request.permission = 'OBJECT_INSPECT' ,request.permission = 'OBJECT_READ' ,request.permission = 'OBJECT_CREATE' ,request.permission = 'OBJECT_OVERWRITE'}
and produce this parse tree visualization:
Once I'm satisfied that the grammar is correct, I can use the plugin to trigger the ANTLR code generation (I used Java as the target language).
As mentioned, in addition to the lexer and parser, ANTLR will also generate code that allows your program to traverse the parse tree. ANTLR provides a built-in tree walker implementation that calls back to a developer-provided listener whenever it enters or leaves a node in the parse tree. A visitor interface and a visitor abstract base class are also generated, which allow a developer to control the traversal of the parse tree directly.
"The biggest difference between the listener and visitor mechanisms is that listener methods are called independently by an ANTLR-provided walker object, whereas visitor methods must walk their children with explicit visit calls. Forgetting to invoke visitor methods on a node's children, means those subtrees don't get visited"
https://github.com/antlr/antlr4/blob/master/doc/listeners.md
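The difference described in the quote can be sketched without ANTLR at all. The Python example below uses a hand-built tree and hypothetical classes (`Node`, `RuleLogger`, `SkippingVisitor`) that only imitate the two mechanisms; the real generated listener and visitor classes have one method per grammar rule and different names. The walker drives the listener through every node, while the visitor reaches only the children it explicitly visits.

```python
class Node:
    """Minimal stand-in for a parse tree node, labeled with its rule name."""

    def __init__(self, rule, children=(), text=None):
        self.rule = rule
        self.children = list(children)
        self.text = text


def walk(listener, node):
    """Walker-driven traversal: the listener only receives callbacks."""
    listener.enter(node)
    for child in node.children:
        walk(listener, child)
    listener.exit(node)


class RuleLogger:
    """Listener that records every enter/exit event it is told about."""

    def __init__(self):
        self.events = []

    def enter(self, node):
        self.events.append("enter " + node.rule)

    def exit(self, node):
        self.events.append("exit " + node.rule)


class SkippingVisitor:
    """Visitor that drives its own traversal and forgets one subtree."""

    def __init__(self):
        self.visited = []

    def visit(self, node):
        self.visited.append(node.rule)
        handler = getattr(self, "visit_" + node.rule, self.visit_children)
        return handler(node)

    def visit_children(self, node):
        for child in node.children:
            self.visit(child)

    def visit_subject(self, node):
        # No call to visit_children here, so the groupName child is never reached.
        return None


tree = Node("allowExpression", [
    Node("subject", [Node("groupName", text="Admins")]),
    Node("verb", text="manage"),
    Node("resource", text="users"),
])

logger = RuleLogger()
walk(logger, tree)
print(logger.events)  # includes enter/exit events for groupName

visitor = SkippingVisitor()
visitor.visit(tree)
print(visitor.visited)  # groupName is missing
```

A listener is usually the simpler choice when every node should be seen; a visitor fits better when the traversal should return values or skip subtrees on purpose.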
Here are some representative OCI IAM policy statements that I used to test the grammar.
##
# Allow expressions
#
ALLOW GROUP foo to manage all-resources in compartment foo:bar
allow group lz2-iam-admin-group to inspect users in tenancy
ALLOW SERVICE hosts_test to MANAGE objects in COMPARTMENT ZtbInternal where any {request.permission = 'OBJECT_INSPECT',request.permission = 'OBJECT_READ',request.permission = 'OBJECT_CREATE',request.permission = 'OBJECT_OVERWRITE'}
ALLOW SERVICE hosts_test to MANAGE objects in COMPARTMENT Internal where request.target.bucket = 'ct-test'
Allow dynamic-group DynamicGroupABC to manage object-family in tenancy
Allow group CANARY-USER to {USER_UPDATE, USER_UIPASS_SET} in compartment prod_canary
allow any-user {PAR_MANAGE} in compartment dis_prod_canary where ALL {request.principal.type='workspace'}
ALLOW any-user to manage dataflow-application in tenancy where ALL {request.principal.type = 'workspace'}
allow any-user {PAR_MANAGE} in tenancy where ALL {request.principal.type='workspace'}
# Group name and compartment name containing '-'
Allow group NewCIS-network-admin-group to read all-resources in compartment NewCIS-network-cmp
# Pattern match examples
Allow any-user to use virtual-network-family in tenancy where any { request.permission !=/SUBNET*/,all{request.permission=/SUBNET*/,target.resource.tag.Operations.Customer=request.principal.group.tag.Operations.Customer}}
Allow group test to manage all-resources in compartment Test where all { request.permission!=/*_SUBNET/,request.permission!=/*_ROUTE/, request.permission!=/*_POLICY/ }
# list of service subjects , compartment name with '.' in name
allow service blockstorage, fssoc1prod to use keys in compartment foo.bar
#list of group name subjects
Allow group A-Admins , B-Admins to manage all-resources in compartment Projects-A-and-B
#list of group id subjects
Allow group id oicid , id ocid2 to manage all-resources in compartment Projects-A-and-B
# list of mixed group id and group name subjects. The existing grammar accepts
# this expression because of the following parser rule:
# groupSubject : GROUP (groupName| groupID) (','(groupName|groupID))* ;
# however, it is not a valid OCI IAM policy expression.
Allow group id oicid , B-Admins to manage all-resources in compartment Projects-A-and-B
#Nested conditions all{..,any{..}}
allow any-user to manage dataflow-application in tenancy where ALL {request.principal.type = 'workspace', any {FOO!=BAR, bar='BAR'}}
Allow group att_iam_admin to manage users in tenancy where all{any{request.operation = 'ListOAuthClientCredentials', request.operation = 'CreateOAuthClientCredential',request.operation = 'DeleteOAuthClientCredential', request.operation = 'UpdateOAuthClientCredential' }, any{ target.user.name = 'oracleidentitycloudservice/manasi.vaishampayan@oracle.com' } }
# BEFORE operator
Allow group Contractors to manage instance-family in tenancy where request.utc-timestamp before '2022-01-01T00:00Z'
#IN operator
Allow group SummerInterns to manage instance-family in tenancy where ANY {request.utc-timestamp.month-of-year in ('6', '7', '8')}
Allow group ComplianceAuditors to read all-resources in tenancy where request.utc-timestamp.day-of-month = '1'
#BETWEEN operator
Allow group DayShift to manage instance-family in tenancy where request.utc-timestamp.time-of-day between '17:00:00Z' and '01:00:00Z'
#dynamic group with permission list & no verb resource
allow DYNAMIC-GROUP logging_analytics_agent to {LOG_ANALYTICS_LOG_GROUP_UPLOAD_LOGS} in tenancy
# Allow expression with compartment id
allow any-user to {LOG_ANALYTICS_LOG_GROUP_UPLOAD_LOGS} in compartment id ocid1.tenancy.oc1..aaaaaaaabpm2vrjioztrjhwxysfkbekblv3jqbgyxq7itbbort45qpefdxxx where all {request.principal.type='serviceconnector', target.loganalytics-log-group.id='ocid1.loganalyticsloggroup.oc1.iad.amaaaaaazm7p2cyaeropw32wuathpseypdic5xfduexdafr7szvuussmxxx', request.principal.compartment.id='ocid1.tenancy.oc1..aaaaaaaabpm2vrjioztrjhwxysfkbekblv3jqbgyxq7itbbort45qpefdxxx'}
# group subject names
allow group 'domain'/'group_name' to manage all-resources in tenancy
Allow group 'foo/bar' to manage all-resources in tenancy
allow group foo/bar to manage all-resources in tenancy
##
# Define expressions
#
define tenancy boat as ocid1.tenancy.oc1..aaaaaaaagkbzgg6lpzrf47xzy4rjoxg4de6ncfiq2rncmjiujvy2jgxxx
define group fleetmanagement_grp as ocid1.group.oc1..aaaaaaaakhzvirthamdvas6hkvmh7eq7cbknygrcsjrsg2kadivi2gxxx
define dynamic-group ADBVAULTDG as ocid1.dynamicgroup.oc1..aaaaaaaan7pmegvo5vmv6aiql2ysllrvym3yrx4ngry4j2zihryx
#Admit expressions
admit group fleetmanagement_grp of tenancy boat to manage all-resources in compartment fleet_manager
admit group fleetmanagement_grp of tenancy boat to read all-resources in tenancy
admit dynamic-group ADBVAULTDG of tenancy foo to use vaults in tenancy
admit group ChildZoneGroup of tenancy ChildZoneTenancy to manage dns-records in tenancy where all {target.dns-zone.name = 'oci-lab.cloud', target.dns-record.type = 'NS', target.dns-domain.name = 'tenancy2.oci-lab.cloud'}
admit service blockstorage to manage vault in tenancy
admit group ChildZoneGroup of tenancy ChildZoneTenancy to associate dns-records in tenancy with dns-zones in tenancy ChildZoneTenancy
##
# Endorse expressions
#
Endorse dynamic-group DynamicGroupABC to manage object-family in tenancy DestinationTenancy
Endorse group CANARY-USER to manage object-family in tenancy DestinationTenancy
Endorse group Administrators to associate local-peering-gateways in tenancy with local-peering-gateways in any-tenancy
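A sample file like the one above can also be checked automatically, one statement per line. The Python sketch below shows one possible harness, assuming that lines starting with `#` are comments to be skipped before parsing. `iter_statements`, `check_samples`, and `toy_parse` are hypothetical names; `toy_parse` is only a stand-in that checks the leading keyword, and in a real project the call into the ANTLR-generated parser would be passed in its place.

```python
KEYWORDS = {"allow", "endorse", "define", "admit"}


def iter_statements(lines):
    """Yield (line number, statement) for each non-blank, non-comment line."""
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        yield number, line


def check_samples(lines, parse):
    """Run parse on every statement and collect (line number, statement, error)."""
    failures = []
    for number, statement in iter_statements(lines):
        try:
            parse(statement)
        except Exception as error:  # any parser error counts as a failed sample
            failures.append((number, statement, str(error)))
    return failures


def toy_parse(statement):
    """Stand-in for a real parser: only checks the leading keyword."""
    first = statement.split()[0].lower()
    if first not in KEYWORDS:
        raise ValueError(f"unknown statement type {first!r}")


samples = [
    "##",
    "# Allow expressions",
    "#",
    "allow group lz2-iam-admin-group to inspect users in tenancy",
    "",
    "permit group lz2-iam-admin-group to inspect users in tenancy",
]
for number, statement, error in check_samples(samples, toy_parse):
    print(f"line {number}: {error}")
```

Passing the parser in as a function keeps the harness independent of how the parser is produced, so the same loop works for a stand-in and for generated code.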
The grammar has proven to be accurate in my project and has been able to describe every valid IAM policy statement I have come across so far. However, as previously stated, the grammar may not be complete enough to identify all erroneous IAM policy statements.
Lexer: https://en.wikipedia.org/wiki/Lexical_analysis
Parser: https://en.wikipedia.org/wiki/Parsing#Parser
Parse tree: https://en.wikipedia.org/wiki/Parse_tree
ANTLR home: https://www.antlr.org/
IntelliJ ANTLR plugin home: https://plugins.jetbrains.com/plugin/7358-antlr-v4