WO2015072979A1

WO2015072979A1 - Workflow change impact

Info

Publication number: WO2015072979A1
Application number: PCT/US2013/069782
Authority: WO
Inventors: Dejan S. Milojicic; Gustavo Ansaldi OLIVA; Virginia Smith
Original assignee: Hewlett-Packard Development Company, L.P.
Priority date: 2013-11-13
Filing date: 2013-11-13
Publication date: 2015-05-21

Abstract

Disclosed herein are a system, non-transitory computer readable medium, and method for determining the impact of a change to a workflow. The likelihood that a change will impact at least one other workflow is determined. A recommendation for the change is displayed.

Description

WORKFLOW CHANGE IMPACT

BACKGROUND

[0001] A workflow may comprise a series of tasks that collectively make up a larger task or function. Large workflow repositories may include a series of complex interdependent workflows. Moreover, as workflows are changed and new ones are added, the number of interdependencies between them tends to increase.

BRIEF DESCRIPTION OF THE DRAWINGS

[0002] FIG. 1 is an example system in accordance with aspects of the disclosure.

[0003] FIG. 2 is a flow diagram in accordance with aspects of the disclosure.

[0004] FIG. 3 is a working example in accordance with aspects of the disclosure.

[0005] FIG. 4 is a further working example in accordance with aspects of the disclosure.

DETAILED DESCRIPTION

[0006] As noted above, interdependencies between workflows may increase over time as workflows are changed and new ones are added. Some workflows may be reused by many other larger workflows in order to reduce redundancy. The interdependencies between the workflows may be direct or indirect. Thus, altering a workflow may require additional precaution, since some changes may have unintended consequences to other directly or indirectly dependent workflows. Workflow changes made without impact analysis may render the entire workflow repository unreliable. Unfortunately, conventional change impact tools may be limited or may be so complex that only specialists with advanced training can interpret and use their findings.

[0007] In view of the foregoing, various examples disclosed herein provide a system, non transitory computer-readable medium and method to determine the impact of a workflow change. In one aspect, a request to change a workflow may be read and the likelihood that the change will impact at least one other workflow may be determined. In another example, a recommendation for the change may be displayed. In yet a further aspect, a graph data structure comprising a plurality of nodes may be analyzed. Each node in the graph may represent a workflow such that each link associating a given pair of nodes in the graph represents a relationship between a given pair of workflows. In yet a further example, the requested change may be categorized based at least partially on the likelihood of impact and the difference in the likelihood of impact to other workflows. Thus, the techniques disclosed herein may enable a user friendly interface for change impact analysis of workflows. That is, the use of a graph data structure may enable the impact analysis to be more understandable and accessible by humans. The aspects, features and advantages of the application will be appreciated when considered with reference to the following description of examples and accompanying figures. The following description does not limit the application; rather, the scope of the application is defined by the appended claims and equivalents.

[0008] FIG. 1 presents a schematic diagram of an illustrative computer apparatus 100 depicting various components in accordance with aspects of the present disclosure. The computer apparatus 100 may include all the components normally used in connection with a computer. For example, it may have a keyboard and mouse and/or various other types of input devices such as pen-inputs, joysticks, buttons, touch screens, etc., as well as a display, which could include, for instance, a CRT, LCD, plasma screen monitor, TV, projector, etc. Computer apparatus 100 may also comprise a network interface (not shown) to communicate with other devices over a network using conventional protocols (e.g., Ethernet, Wi-Fi, Bluetooth, etc.). The computer apparatus 100 may also contain a processor 110, which may be any number of well known processors, such as processors from Intel® Corporation. In another example, processor 110 may be an application specific integrated circuit ("ASIC"). Non-transitory computer readable medium ("CRM") 112 may store instructions that may be retrieved and executed by processor 110. As will be discussed in more detail below, the instructions may include a change analyzer 114. Furthermore, non-transitory CRM 112 may include graph data structure 116, which may be read by processor 110. Non-transitory CRM 112 may be used by or in connection with any instruction execution system that can fetch or obtain the logic from non-transitory CRM 112 and execute the instructions contained therein.

[0009] Non-transitory CRM 112 may also comprise any one of many physical media such as, for example, electronic, magnetic, optical, electromagnetic, or semiconductor media. More specific examples of suitable non-transitory computer-readable media include, but are not limited to, a portable magnetic computer diskette such as floppy diskettes or hard drives, a read-only memory ("ROM"), an erasable programmable read-only memory, a portable compact disc or other storage devices that may be coupled to computer apparatus 100 directly or indirectly. Alternatively, non-transitory CRM 112 may be a random access memory ("RAM") device or may be divided into multiple memory segments organized as dual in-line memory modules ("DIMMs"). The non-transitory CRM 112 may also include any combination of one or more of the foregoing and/or other devices as well. While only one processor and one non-transitory CRM are shown in FIG. 1, computer apparatus 100 may actually comprise additional processors and computer readable mediums that may or may not be stored within the same physical housing or location. For example, non-transitory CRM 112 may be a hard drive or other storage media located in a server farm of a data center. Accordingly, references to a processor, computer, or computer readable medium will be understood to include references to a collection of processors, computers, or mediums that may or may not operate in parallel.

[0010] Change analyzer 114 may comprise any set of instructions to be executed directly (such as machine code) or indirectly (such as scripts) by processor 110. In this regard, the terms "instructions," "scripts," and "applications" may be used interchangeably herein. The computer executable instructions may be stored in any computer language or format, such as in object code or modules of source code. Furthermore, it is understood that the instructions may be implemented in the form of hardware, software, or a combination of hardware and software and that the examples herein are merely illustrative.

[0011] Although the architecture of graph data structure 116 is not limited by any particular data structure, the data may be stored in computer registers, in a relational database as a table having a plurality of different fields and records, XML documents or flat files. The graph data structure 116 may also be formatted in any computer-readable format. The data may comprise any information sufficient to identify the relevant information, such as numbers, descriptive text, proprietary codes, references to data stored in other areas of the same memory or different memories (including other network locations) or information that is used by a function to calculate the relevant data.

[0012] As will be discussed in more detail below, graph data structure 116 may comprise a plurality of nodes that may be analyzed to determine the impact of the change. In one example, each node in the graph may represent a workflow such that each link associating a given pair of nodes in the graph represents a relationship between a given pair of workflows. In one aspect, change analyzer 114 may instruct at least one processor to read a request for a change to a given workflow and to determine a likelihood that the change will impact at least one other workflow. Such determination may be based at least partially on an analysis of the graph data structure. In a further aspect, change analyzer 114 may instruct at least one processor to display a recommendation for the change based at least partially on the analysis.

[0013] Working examples of the system, method, and non-transitory computer-readable medium are shown in FIGS. 2-4. In particular, FIG. 2 illustrates a flow diagram of an example method 200 for determining the impact of a change to a workflow. FIGS. 3-4 each show a working example in accordance with the techniques disclosed herein. The actions shown in FIGS. 3-4 will be discussed below with regard to the flow diagram of FIG. 2.

[0014] Referring to FIG. 2, it may be determined whether a requested change will impact at least one other workflow, as shown in block 202. Referring now to FIG. 3, an example graph data structure 300 is shown. Each node in the graph (i.e., nodes F1, F2, F3, F4, F5, F6, F7, F8, and F12) may represent a workflow and each link associating a pair of nodes may represent a relationship between a given pair of workflows. In one example, the relationship formed between a given pair of workflows may be at least partially based on historical workflow development data. For example, workflow changes may be monitored as they occur and development patterns may be identified. For instance, it may be determined that every time a change is made to the workflow represented by node F2, a change is always made to the workflow represented by node F12. This illustrative pattern may form a dependency between node F2 and F12. In another example, the relationship may be at least partially based on a direct or indirect calling relationship between a workflow and at least one other workflow. By way of example, graph 300 is an example static call-graph that represents illustrative calling relationships between workflows. In the graph data structure 300, each directed arrow or link starting at a given workflow Fi and pointing to another workflow Fj may indicate that workflow Fi invokes workflow Fj (e.g., Fi has a subflow step that invokes Fj).
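The co-change pattern described above (a change to F2 always being accompanied by a change to F12) could be detected along the following lines. This is only a minimal sketch: the change history, the function name co_change_dependencies, and the min_support parameter are hypothetical and do not appear in the disclosure, which does not specify how development patterns are identified.

```python
from collections import defaultdict
from itertools import permutations


def co_change_dependencies(change_sets, min_support=2):
    """Return pairs (a, b) where every recorded change to a also changed b.

    min_support is a hypothetical guard: a must have been changed at least
    that many times before the pattern is treated as a dependency.
    """
    changes = defaultdict(int)    # workflow -> number of change sets containing it
    together = defaultdict(int)   # (a, b) -> number of change sets containing both
    for change_set in change_sets:
        for workflow in change_set:
            changes[workflow] += 1
        for a, b in permutations(change_set, 2):
            together[(a, b)] += 1
    return {
        (a, b)
        for (a, b), count in together.items()
        if changes[a] >= min_support and count == changes[a]
    }


# Hypothetical history: each set lists the workflows modified together.
history = [{"F2", "F12"}, {"F2", "F12", "F5"}, {"F12"}, {"F5"}]
deps = co_change_dependencies(history)
```

With this invented history, F2 was changed twice and F12 was changed both times, so the pair (F2, F12) is reported; F12 was also changed once on its own, so the reverse pair is not.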

[0015] In the example of FIG. 3, change analyzer 304 attempts to analyze the impact of a change to workflow F12. In one example, change analyzer 304 may attempt to determine the likelihood that the change to workflow F12 will impact at least one other workflow. Change analyzer 304 may also determine the difference in the likelihood of impact among workflows due to the change. These determinations may be based at least partially on an analysis of the graph data structure. In one example, change analyzer 304 may begin its analysis by discovering all workflows that directly invoke workflow F12. In a further example, change analyzer 304 may delve deeper into the graph to discover workflows that may indirectly invoke workflow F12. In another example, change analyzer 304 may categorize the change based at least partially on the likelihood of impact and the difference in the likelihood of impact among workflows due to the change. In one example implementation, the number of workflows that are likely impacted due to a workflow change may be known as the "scattering metric". In another example implementation, the size of the subset of workflows that have a high chance of being impacted may be known as the "impact metric." Therefore, in this instance, the impact metric may be less than or equal to the scattering metric.

[0016] In the example of FIG. 3, the scattering metric for workflow F12 may be determined by recursively tracing the workflows that are directly or indirectly dependent on workflow F12. In this example, node F12 would have a scattering metric of eight, since F1, F2, and F3 may be directly impacted by a change to workflow F12 (i.e., they directly invoke workflow F12) and five other nodes may be indirectly impacted due to their relationship to F1, F2, and F3. For example, workflow F7 invokes workflow F4; in turn, workflow F4 invokes workflow F1, which directly invokes workflow F12. Thus, a change to workflow F12 may impact workflows further removed from workflow F12 due to the chain of dependencies. Furthermore, workflow F2, which directly invokes workflow F12, is also invoked by workflow F5; in turn, workflow F5 is invoked by workflow F8 and F7. Finally, workflow F6 invokes workflow F3. Therefore, a change to workflow F12 can potentially impact every node in graph 300. In this example, considering that the graph depicts the whole repository, a scattering metric of eight would be deemed very high, since there are only nine workflows represented by the graph. In contrast, a change to workflow F6 would have a scattering metric of zero, since no workflows invoke workflow F6 directly or indirectly. The scattering metric indicates the number of workflows that may be impacted. In turn, the impact metric may indicate the size of the subset of possibly impacted workflows whose probability of being impacted is higher than or equal to a predefined threshold p. Once the scattering and impact metrics are calculated for every workflow, workflows may be categorized in accordance with the four example blocks illustrated in FIG. 4.
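The scattering metric described above can be illustrated with a short sketch. The call relationships below are the ones stated in this paragraph for FIG. 3; the dictionary layout and the function names are assumptions made for illustration, and an explicit stack is used in place of recursion.

```python
from collections import defaultdict

# Call relationships stated in the text for FIG. 3: caller -> callees.
CALLS = {
    "F1": ["F12"], "F2": ["F12"], "F3": ["F12"],
    "F4": ["F1"], "F5": ["F2"], "F6": ["F3"],
    "F7": ["F4", "F5"], "F8": ["F5"], "F12": [],
}


def build_callers(calls):
    """Invert the call graph: callee -> set of workflows that invoke it."""
    callers = defaultdict(set)
    for caller, callees in calls.items():
        for callee in callees:
            callers[callee].add(caller)
    return callers


def dependents(workflow, calls):
    """Collect every workflow that directly or indirectly invokes `workflow`."""
    callers = build_callers(calls)
    seen = set()
    stack = [workflow]
    while stack:
        current = stack.pop()
        for caller in callers.get(current, ()):
            if caller not in seen:
                seen.add(caller)
                stack.append(caller)
    seen.discard(workflow)  # a workflow is not counted as its own dependent
    return seen


def scattering_metric(workflow, calls):
    """Number of workflows that may be impacted by a change to `workflow`."""
    return len(dependents(workflow, calls))
```

For this graph, scattering_metric("F12", CALLS) evaluates to 8 and scattering_metric("F6", CALLS) to 0, matching the values given in the text.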

[0017] Calculating the impact metric of workflow Fi may rely on determining the likelihood of impact to a workflow Fj due to a change to workflow Fi. Referring back to FIG. 3, if workflow F12 is invoked in every possible execution path of workflow F2, the likelihood of impact to workflow F2 is higher. However, if workflow F12 is invoked in one of many possible execution paths in workflow F2, the likelihood of impact to workflow F2 is lower. The probability of workflow F5 being impacted by a change to workflow F12 may be determined based on the results of F2. This analysis may be performed recursively to every workflow in the chain.

[0018] An illustrative function calculateImpact(Fi, p) may return the quantity of workflows that have a probability higher than or equal to p of being impacted when workflow Fi is changed. The following example pseudo code may be used for the function calculateImpact(Fi, p):

Algorithm 1: calculateImpact(Fi, p)

    chancesOfImpact <- createEmptyMap()
    chancesOfImpact.put(Fi, 1)
    callGraph <- Fi.getCallGraph()
    topSort <- calcTopologicalSort(callGraph)
    topSort.removeFirst()
    for i from 0 to topSort.size - 1 do
        Fj <- topSort[i]
        chance <- calcChanceOfImpact(Fj, chancesOfImpact)
        chancesOfImpact.put(Fj, chance)
    end for
    chancesOfImpact.remove(Fi)
    impact <- number of entries from chancesOfImpact whose value is >= p
    return impact

Algorithm 2: calcChanceOfImpact(Fj, chancesOfImpact)

    execPaths <- getExecutionPaths(Fj)
    sumPathImpact <- 0
    for each execPath in execPaths do
        pathImpact <- calcPathImpact(execPath, chancesOfImpact)
        sumPathImpact <- sumPathImpact + pathImpact
    end for
    avgPathImpact <- sumPathImpact / execPaths.size()
    chanceOfImpact <- avgPathImpact
    return chanceOfImpact

Algorithm 3: calcPathImpact(execPath, chancesOfImpact)

    maxStepImpact <- 0
    n <- execPath.numberOfSteps()
    for i from 0 to n-1 do
        step <- execPath[i]
        if (chancesOfImpact.containsKey(step.element)) then
            positionCoef <- (n - 1 - i) / (n - 1)
            chance <- chancesOfImpact.get(step.element)
            stepImpact <- positionCoef * chance
            if (stepImpact > maxStepImpact) then
                maxStepImpact <- stepImpact
            end if
        end if
    end for
    pathImpact <- maxStepImpact
    return pathImpact

[0019] The foregoing example pseudo code shows three illustrative algorithms working in conjunction. Generally, the example algorithm calculateImpact(Fi, p) ("first algorithm") may determine the order in which the potentially impacted workflows will be analyzed. For example, in order to calculate the probability of workflow F5 being impacted by a change to workflow F12, the probability of impact to F2 may be determined first, since workflow F5 invokes workflow F2. Therefore, in one example, the order may be based on the topological order of the graph. Given the example graph 300, one possible topological order may be: F12, F3, F6, F2, F5, F8, F1, F4, and F7. The workflow being changed, in this case workflow F12, may be the first vertex in the topological order. The first algorithm may further determine the probability of impact for each workflow that directly or indirectly depends on Fi in topological order by invoking another example algorithm called calcChanceOfImpact ("second algorithm"). A discussion of the second algorithm is provided below. The first algorithm may then determine the number of possibly impacted workflows that have an impact probability higher than or equal to the threshold p.

[0020] As noted above, the second algorithm may be used to determine the probability that a given workflow Fj would be impacted by a change to workflow Fi. The second algorithm may determine all possible execution paths of Fj. In particular, if Fj has n steps, then one valid execution path Q of Fj may be an ordered list of steps where Q[0] is a start step, Q[n-1] is an end step, and Q[i] is connected to Q[i+1] for 0 ≤ i < n-1. Obtaining all execution paths may be complicated if the workflow includes loops and parallel executions. In the event of a loop, a cycle may be included only once in the same path. As for parallel executions, each execution may be counted as a separate workflow and the one with the highest probability of impact may be considered. The probability that each execution path of Fj may invoke the workflow being changed Fi may be determined by invoking yet another example algorithm calcPathImpact ("third algorithm"). A discussion of this third algorithm is provided below. The second algorithm may determine the average impact probability of all paths to determine the probability that the entire workflow Fj would be impacted by a change to Fi.
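The topological ordering used by the first algorithm can be sketched with the standard-library graphlib module (Python 3.9 or later). The call relationships are those stated in the text for FIG. 3; the dictionary layout is an assumption. TopologicalSorter treats each mapped value as the set of nodes that must come first, so mapping each caller to its callees yields an order in which every invoked workflow precedes its invokers. The sketch relies on every workflow in this example depending on F12; in a larger repository the graph would first be restricted to the affected workflows.

```python
from graphlib import TopologicalSorter

# Call relationships stated in the text for FIG. 3: caller -> callees.
CALLS = {
    "F1": ["F12"], "F2": ["F12"], "F3": ["F12"],
    "F4": ["F1"], "F5": ["F2"], "F6": ["F3"],
    "F7": ["F4", "F5"], "F8": ["F5"], "F12": [],
}

# Callees are emitted before their callers, so the changed workflow F12
# (the only workflow that invokes nothing) comes first.
order = list(TopologicalSorter(CALLS).static_order())
```

Several valid orders exist, so the exact sequence may differ from the one listed in the text, but F12 is always first and F2 always precedes F5.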

[0021] The third algorithm may be used to determine the probability of impact to a path within a workflow. The third algorithm may analyze each step in the workflow to determine if it directly or indirectly invokes another workflow. In particular, the third algorithm may determine if the workflow being invoked is Fi (the workflow being changed) or if the workflow being invoked eventually leads to Fi being invoked. In one example, the probability of impact to a step in the workflow may be based on the step's position in the execution path. Steps that occur earlier may receive a higher impact probability, while steps that occur later in the path may receive a lower impact probability. This approach assumes that the probability of impact to a workflow Fj by a change to workflow Fi is greater when Fj invokes Fi early in its execution. This assumption is based on the notion that all subsequent steps of Fj may be susceptible to erroneous behavior by Fi, since the proper behavior of these subsequent steps may depend on some outcome determined by Fi. In an extreme case, the first step in Fj would invoke Fi. In this instance, the probability of impact may be very high.
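The three algorithms may be easier to follow as a small runnable sketch. The execution paths below are invented for illustration (the disclosure does not list the steps of any workflow), each step is represented simply by the name of the element it executes, and the analysis order is passed in rather than computed. Treating a single-step path as having a position coefficient of 1 is also an assumption, made here to avoid a division by zero.

```python
def calc_path_impact(exec_path, chances):
    """Third algorithm: highest position-weighted impact among a path's steps."""
    n = len(exec_path)
    max_step_impact = 0.0
    for i, element in enumerate(exec_path):
        if element in chances:
            # Earlier steps weigh more; the final step of a path weighs zero.
            # Assumption: a one-step path gets coefficient 1.
            position_coef = (n - 1 - i) / (n - 1) if n > 1 else 1.0
            max_step_impact = max(max_step_impact, position_coef * chances[element])
    return max_step_impact


def calc_chance_of_impact(workflow, chances, exec_paths):
    """Second algorithm: average path impact over all execution paths."""
    paths = exec_paths[workflow]
    return sum(calc_path_impact(path, chances) for path in paths) / len(paths)


def calculate_impact(changed, p, order, exec_paths):
    """First algorithm: count workflows whose chance of impact is >= p.

    `order` is a topological order that starts with the changed workflow.
    """
    chances = {changed: 1.0}
    for workflow in order[1:]:
        chances[workflow] = calc_chance_of_impact(workflow, chances, exec_paths)
    del chances[changed]
    return sum(1 for chance in chances.values() if chance >= p)


# Hypothetical execution paths: F2 invokes F12 first in one of its two paths,
# and F5 invokes F2 as the middle step of its only path.
EXEC_PATHS = {
    "F2": [["F12", "a", "b"], ["a", "b", "c"]],
    "F5": [["x", "F2", "y"]],
}
ORDER = ["F12", "F2", "F5"]
```

With these invented paths, F2 has a chance of impact of 0.5 (one of two paths, coefficient 1) and F5 has 0.25 (coefficient 0.5 times 0.5), so calculate_impact("F12", 0.5, ORDER, EXEC_PATHS) counts one workflow and a threshold of 0.2 counts two.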

[0022] Returning now to FIG. 2, a recommendation may be displayed based at least partially on the analysis, as shown in block 204. Referring now to FIG. 4, an example coordinate system is shown to illustrate the categorization of a workflow change. In this example, the x axis may represent the scattering metric of a change (e.g., how many workflows may be impacted) and the y axis may represent the impact metric of a change (e.g., how many of these workflows have a probability higher than or equal to p of being impacted). The coordinate system may be divided into four categories and the threshold of each category may be determined based on specific factors including, but not limited to, the size of the repository or the size of the workflow dependency chain. Category 404 may indicate a high scattering metric and a high impact metric. If a workflow change falls within this category, the recommendation displayed to a user may comprise a strong cautionary statement, since the change could potentially impact many workflows whose probability of being impacted is greater than or equal to p. In another example, the recommendation may prompt the user to check the high risk workflows and may provide a link to those workflows. In contrast, a change falling within category 406 may indicate that the change has a low scattering and low impact metric. In this instance, the risk is low and a recommendation may simply tell the user that the change is safe. Category 402 may indicate that the scattering metric is low, but the impact metric is high. That is, while not many workflows are directly or indirectly dependent on the workflow being changed, a high percentage of the potentially impacted workflows have a probability of being impacted greater than or equal to p. Here, the recommendation may display the high risk workflows. Category 408 may indicate a high scattering metric, but a low impact metric. That is, there are many workflows that directly or indirectly invoke the workflow in question, but a low percentage of those have a probability of being impacted that is higher than or equal to p. Here, the high risk workflows may also be displayed to the user.
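The four categories of FIG. 4 can be expressed as a simple lookup on the two metrics. In the sketch below, the threshold parameters and the recommendation wording are hypothetical; the disclosure only states that thresholds may depend on factors such as the size of the repository or of the dependency chain.

```python
def categorize(scattering, impact, scattering_threshold, impact_threshold):
    """Map a change to one of the four example categories of FIG. 4.

    Both thresholds are hypothetical inputs chosen by the caller.
    Returns the category reference number and an illustrative recommendation.
    """
    high_scattering = scattering >= scattering_threshold
    high_impact = impact >= impact_threshold
    if high_scattering and high_impact:
        return 404, "Caution: many workflows are likely to be impacted; check the high risk workflows."
    if high_scattering:
        return 408, "Many workflows depend on this one, but few are likely to be impacted; review the high risk workflows."
    if high_impact:
        return 402, "Few workflows depend on this one, but most are likely to be impacted; review the high risk workflows."
    return 406, "The change is safe."


# Example thresholds for a nine-workflow repository (assumed values).
category, recommendation = categorize(scattering=8, impact=6,
                                      scattering_threshold=4, impact_threshold=3)
```

Because the impact metric cannot exceed the scattering metric, a change in category 402 necessarily has an impact threshold that is no greater than its scattering value.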

[0023] Advantageously, the above-described apparatus and method provide a user-friendly interface for change impact analysis. In this regard, a graph data structure in which each node represents one workflow may be used to analyze the impact that a change has on a workflow repository. In turn, the graph data analysis enables user-friendly recommendations to be provided to workflow developers as they make changes.

[0024] Although the disclosure herein has been described with reference to particular examples, it is to be understood that these examples are merely illustrative of the principles of the disclosure. It is therefore to be understood that numerous modifications may be made to the examples and that other arrangements may be devised without departing from the spirit and scope of the disclosure as defined by the appended claims. Furthermore, while particular processes are shown in a specific order in the appended drawings, such processes are not limited to any particular order unless such order is expressly set forth herein. Rather, processes may be performed in a different order or concurrently.

Claims

1. A system comprising:

a graph data structure comprising a plurality of nodes, each node in the graph representing a workflow such that each link associating a given pair of nodes in the graph represents a relationship between a given pair of workflows;

a change analyzer which upon execution instructs at least one processor to:

read a request for a change to a given workflow;

determine a likelihood that the change will impact at least one other workflow based at least partially on an analysis of the graph data structure; and

display a recommendation for the change based at least partially on the analysis.

2. The system of claim 1, wherein the relationship between the given pair of workflows is based at least partially on a direct or indirect calling relationship between the given pair of workflows.

3. The system of claim 1, wherein the relationship formed between the given pair of workflows is based at least partially on historical workflow development data.

4. The system of claim 1, wherein the analysis of the graph comprises a difference in a likelihood of impact among workflows due to the change to the given workflow.

5. The system of claim 4, wherein the change analyzer upon execution further instructs at least one processor to categorize the change based at least partially on the likelihood of impact and the difference in the likelihood of impact.

6. A non-transitory computer readable medium having instructions therein which upon execution cause at least one processor to:

read a request for a change to a given workflow among a plurality of workflows;

analyze a graph data structure comprising a node representing the given workflow and at least one other node representing at least one different workflow;

determine whether the change will impact the at least one different workflow based at least partially on an analysis of the node and the at least one other node; and

7. The non-transitory computer readable medium of claim 6, wherein to determine the impact of the change the instructions therein upon execution further instruct at least one processor to determine whether the node of the given workflow has a direct or indirect calling relationship with at least one other node representing at least one other workflow.

8. The non-transitory computer readable medium of claim 6, wherein to determine the impact of the change the instructions therein upon execution further instruct at least one processor to determine a relationship between the given workflow and the at least one different workflow based at least partially on historical workflow development data.

9. The non-transitory computer readable medium of claim 6, wherein the analysis comprises a difference in a likelihood of impact among workflows due to the change.

10. The non-transitory computer readable medium of claim 9, wherein the instructions therein upon execution further instruct at least one processor to categorize the change based at least partially on the likelihood of impact and the difference in the likelihood of impact among workflows due to the change.

11. A method comprising:

reading, using at least one processor, a request for a change to a workflow;

categorizing, using at least one processor, the change in accordance with a likelihood of impact that the change would have to at least one other workflow; and

displaying, using at least one processor, a recommendation for the change based on a categorization of the change.

12. The method of claim 11, wherein categorizing the change further comprises analyzing, using at least one processor, a graph data structure comprising a plurality of nodes, each node in the graph representing a given workflow.

13. The method of claim 12, wherein categorizing the change further comprises determining, using at least one processor, whether a node representing the workflow to be changed has a direct or indirect calling relationship with at least one other node representing at least one other workflow.

14. The method of claim 12, wherein categorizing the change further comprises determining, using at least one processor, whether the workflow to be changed has a relationship with at least one other workflow based at least partially on historical workflow development data.

15. The method of claim 11, wherein categorizing the change is further based on a difference in the likelihood of impact among workflows due to the change.