Is there an efficient approach to discovering what is extraneous ("dead" or "unused") code in OpenSCAD to promote the learning process?

I am making progress by using working examples of code; however, some contain code with lots of modules and "test cases" (like gear generators) that aren't always used or referenced. These are great utilities but in trying to simplify the file to do only a small part of what is possible (usually by commenting out what I think are unused lines of code) things (to me) get very unpredictable. Reviewing the "AST Dump" has helped some but I'm hoping there is a more practical approach to isolating only the bare essential code.

  • This is a totally useless answer from a practical standpoint, but [it is impossible](https://en.wikipedia.org/wiki/Rice%27s_theorem) to determine whether code is unused or not. – Tom van der Zanden Aug 29 '19 at 20:06
  • @TomvanderZanden: [it is impossible](https://en.wikipedia.org/wiki/Dead_code_elimination) only in theory (i.e. for all possible algorithms, or whatever). One approach might be to start from an empty file and copy the bits you want, and then add more in response to errors. – Tomas By Aug 29 '19 at 20:39
  • Thanks, Tomas. With the more complex files I've encountered, which have modules that reference other modules and use variables 'declared' in hidden parts of the lengthy code, it is more practical to start by commenting out suspected "extraneous" code. I was hoping there is a more efficient way of doing this, besides commenting out section by section to see what happens to the model. Though I can appreciate why it could be said to be impossible, I respectfully disagree entirely with that assessment. "Impractical"? Maybe, but that depends on each example. NOT impossible! – JoeVanGeaux Aug 29 '19 at 20:57
  • @TomvanderZanden: Rice's Theorem has nothing to do with whether code is used, just whether it's reached during execution. In most languages, for the program to be valid, an identifier that's referenced must have a definition somewhere in the source regardless of whether it will be reached at runtime, so "used" is purely a static property amenable to trivial static analysis. – R.. GitHub STOP HELPING ICE Aug 30 '19 at 01:36
  • @R.. The question asked about "extraneous (dead or unused)" code. I interpreted that more broadly, to include unreachable code (which is commonly called "dead code"). – Tom van der Zanden Aug 30 '19 at 05:49
  • @TomvanderZanden: I suppose that's a valid interpretation, but removing unused code in that sense requires invasive changes to the source that would probably be undesirable. In any case, Rice's Theorem still doesn't apply to a particular model, although it may apply to a parametrized one, since with a fixed input for which the program (presumably) terminates reachability is computable just by tracing. – R.. GitHub STOP HELPING ICE Aug 30 '19 at 13:10
  • @R, @TomvanderZanden, thanks. I attempted to avoid semantics I thought would misrepresent my original pursuit. Take, for example, a case where an "if" statement results in bypassing a large part of the code; that code is then effectively "dead" or "unused" in the sense that I can 'comment out' that code and effectively streamline the file for easier reading. This would be useful if I never really intend to take any optional path, like if I were always only generating one type of object (in my case, a simple involute gear or if I wanted -but I don't- a complex series of reduction gears). – JoeVanGeaux Aug 31 '19 at 04:01
  • Are you considering only the parts executed, or also the parts that contribute to the object? Consider code that creates some geometry, and then unions it with a solid cube that contains that geometry. The first code now contributes no geometric complexity. Is it "used"? It could be simplified or removed. – cmm Sep 12 '19 at 11:12

1 Answer

I have written compilers and optimizers, as well as optimizers which work on raster operations. I so want to offer you a beautiful solution, but I don't know of one.

If I really wanted to solve this, I would start with this approach:

First, identify each token and expression in the OpenSCAD code. Assign each a unique identifier, and reserve one bit of storage, initially zeroed, that will be used later.

At the geometric database level, tag each geometric element with the identifier in the source code that generates it. Intersections become labeled with both operands and the intersection operator. Same with unions and other operations.

Then render the OpenSCAD form into some representation, perhaps voxels, perhaps STL. Voxels have a resolution limit but are intrinsically the simplest form. STLs might require an optimization pass to find redundant edges -- although that may already be part of the STL generation process.

Now, go through the tagged representation voxel by voxel or triangle by triangle. For every identifier associated with that voxel or triangle, set the bit reserved for the corresponding token or expression to one.

Finally, check the bits on each of the source tokens and expressions. If the bit is zero, that particular element contributed nothing to the result. If the bit is one, that is needed in the result.

This is oversimplified because one source element can be used several times. Some instances may contribute to the output, and some may not. A fully expanded version of the source should be used, and we have to invent notation for reporting that instances 1 and 3 are used but 2 is not. I'm sure there are other over-simplifications which someone reading this will immediately realize.
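The marking pass described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not something OpenSCAD provides: the `Tag` class, the `mark_used` and `report` functions, and the sample data are all hypothetical, and the sketch assumes some earlier stage has already attached a set of (source identifier, instance number) tags to every triangle or voxel of the rendered result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """Hypothetical label attached to geometry by an instrumented renderer."""
    source_id: int  # identifier of the token or expression in the source
    instance: int   # which expansion of that source element (1-based)


def mark_used(all_tags, elements):
    """Return {tag: bit}; a bit is True if any output element carries the tag.

    all_tags: every tag created while expanding the source.
    elements: one set of tags per voxel or triangle in the final result.
    """
    used = {tag: False for tag in all_tags}  # the reserved bits, initially zero
    for tags in elements:
        for tag in tags:
            used[tag] = True
    return used


def report(used):
    """Group the bits by source element: {source_id: {instance: bit}}."""
    by_source = {}
    for tag, bit in used.items():
        by_source.setdefault(tag.source_id, {})[tag.instance] = bit
    return by_source


# Made-up sample: source element 2 is expanded three times, element 3 once.
all_tags = [Tag(1, 1), Tag(2, 1), Tag(2, 2), Tag(2, 3), Tag(3, 1)]
triangles = [{Tag(1, 1)}, {Tag(1, 1), Tag(2, 1)}, {Tag(2, 3)}]

if __name__ == "__main__":
    for source_id, instances in sorted(report(mark_used(all_tags, triangles)).items()):
        print(source_id, instances)
```

In this made-up data, instances 1 and 3 of source element 2 end up marked while instance 2 does not, and source element 3 contributes nothing at all. The hard part, tagging the geometry inside the renderer, is exactly what this sketch leaves out.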

I think, though, that this would be a path forward to the OpenSCAD optimizer you want.

In the meantime, OpenSCAD provides some prefix (modifier) characters. One of them, `*` (the disable modifier), removes a sub-tree from the generated geometry without messing up the parsing. I frequently use that mark to find out if code is used and to what it contributes.
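If the "disable it and see what changes" routine gets tedious by hand, a small script can produce the modified source for you. The following is a minimal sketch: `disable_line` is a hypothetical helper, and it assumes the chosen line begins an object or module instantiation (such as `cube(10);`), not a module definition or a variable assignment, where a `*` prefix would not be valid.

```python
def disable_line(source: str, line_no: int) -> str:
    """Return a copy of the OpenSCAD source with '*' prefixed to one line.

    line_no is 1-based. The '*' is placed after the existing indentation so
    the statement on that line, and its whole sub-tree, is disabled.
    """
    lines = source.splitlines()
    line = lines[line_no - 1]
    indent = len(line) - len(line.lstrip())
    lines[line_no - 1] = line[:indent] + "*" + line[indent:]
    return "\n".join(lines)


# Made-up example model; line 3 is the statement to disable.
example = """union() {
    cube(10);
    translate([20, 0, 0]) sphere(5);
}"""

if __name__ == "__main__":
    print(disable_line(example, 3))
```

Rendering the original and the modified file and comparing the previews shows what the disabled statement contributed; if nothing changes, that statement is a candidate for removal.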

cmm