Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/tools/internal/diff: maxDiffs is too small #71648

Open
adonovan opened this issue Feb 10, 2025 · 4 comments
Open

x/tools/internal/diff: maxDiffs is too small #71648

adonovan opened this issue Feb 10, 2025 · 4 comments
Assignees
Labels
Refactoring Issues related to refactoring tools ToolProposal Issues describing a requested change to a Go tool or command-line program.
Milestone

Comments

@adonovan
Copy link
Member

adonovan commented Feb 10, 2025

Here are two diffs produced by the internal/diff package (via modernize -fix -diff) using different values (1000 and 100) of the maxDiffs constant. The first is very precise; the second is very rough. However, the first costs (in theory) up to 100x more memory.

What value of maxDiffs should we use? Can we afford 1000? Will that break down for larger files? What do state-of-the-art diff tools do?

@pjweinb

(Aside: the first diff would be even better if adjacent insertions and deletions were preserved rather than interleaved. This may be a consequence of the unified diff conversion, which expands byte-oriented diffs to whole lines but perhaps misses an opportunity to coalesce hunks that, as a result, now abut each other. Not sure.)

maxDiffs=1000

--- /Users/adonovan/w/xtools/internal/refactor/inline/callee.go (old)
+++ /Users/adonovan/w/xtools/internal/refactor/inline/callee.go (new)
@@ -6,7 +6,9 @@
 
 // This file defines the analysis of the callee function.
 
-import (
+import "slices"
+
+import (
 	"bytes"
 	"encoding/gob"
 	"fmt"
@@ -303,7 +305,7 @@
 						return nil, tuple.At(i).Type()
 					}
 				}
-				for i := 0; i < sig.Results().Len(); i++ {
+				for i := range sig.Results().Len() {
 					expr, typ := argInfo(i)
 					var flags returnOperandFlags
 					if typ == types.Typ[types.UntypedNil] { // untyped nil is preserved by go/types
@@ -437,10 +439,10 @@
 		if sig.Recv() != nil {
 			params = append(params, newParamInfo(sig.Recv(), false))
 		}
-		for i := 0; i < sig.Params().Len(); i++ {
+		for i := range sig.Params().Len() {
 			params = append(params, newParamInfo(sig.Params().At(i), false))
 		}
-		for i := 0; i < sig.Results().Len(); i++ {
+		for i := range sig.Results().Len() {
 			results = append(results, newParamInfo(sig.Results().At(i), true))
 		}
 	}
@@ -572,11 +572,9 @@
  
 	// Types do not need to match for an initializer with known type.
 	if spec, ok := parent.(*ast.ValueSpec); ok && spec.Type != nil {
-		for _, v := range spec.Values {
-			if v == expr {
+		if slices.Contains(spec.Values, expr) {
-				typ := info.TypeOf(spec.Type)
+			typ := info.TypeOf(spec.Type)
-				return true, typ == nil || types.IsInterface(typ), false
-			}
+			return true, typ == nil || types.IsInterface(typ), false
 		}
 	}
 
@@ -616,7 +614,7 @@
 				return true, types.IsInterface(under.Elem()), false
 			case *types.Struct: // Struct{k: expr}
 				if id, _ := kv.Key.(*ast.Ident); id != nil {
-					for fi := 0; fi < under.NumFields(); fi++ {
+					for fi := range under.NumFields() {
 						field := under.Field(fi)
 						if info.Uses[id] == field {
 							return true, types.IsInterface(field.Type()), false

maxDiffs=100 (current default)

--- /Users/adonovan/w/xtools/internal/refactor/inline/callee.go (old)
+++ /Users/adonovan/w/xtools/internal/refactor/inline/callee.go (new)
@@ -6,7 +6,9 @@
 
 // This file defines the analysis of the callee function.
 
-import (
+import "slices"
+
+import (
 	"bytes"
 	"encoding/gob"
 	"fmt"
@@ -303,7 +305,7 @@
 						return nil, tuple.At(i).Type()
 					}
 				}
-				for i := 0; i < sig.Results().Len(); i++ {
+				for i := range sig.Results().Len() {
 					expr, typ := argInfo(i)
 					var flags returnOperandFlags
 					if typ == types.Typ[types.UntypedNil] { // untyped nil is preserved by go/types
@@ -437,146 +439,144 @@
 		if sig.Recv() != nil {
 			params = append(params, newParamInfo(sig.Recv(), false))
 		}
-		for i := 0; i < sig.Params().Len(); i++ {
+		for i := range sig.Params().Len() {
 			params = append(params, newParamInfo(sig.Params().At(i), false))
 		}
-		for i := 0; i < sig.Results().Len(); i++ {
-			results = append(results, newParamInfo(sig.Results().At(i), true))
-		}
-	}
-
-	// Search function body for operations &x, x.f(), and x = y
-	// where x is a parameter, and record it.
-	escape(info, decl, func(v *types.Var, escapes bool) {
-		if info := paramInfos[v]; info != nil {
-			if escapes {
-				info.Escapes = true
-			} else {
-				info.Assigned = true
-			}
-		}
-	})
-
-	// Record locations of all references to parameters.
-	// And record the set of intervening definitions for each parameter.
-	//
-	// TODO(adonovan): combine this traversal with the one that computes
-	// FreeRefs. The tricky part is that calleefx needs this one first.
-	fieldObjs := fieldObjs(sig)
-	var stack []ast.Node
-	stack = append(stack, decl.Type) // for scope of function itself
-	ast.Inspect(decl.Body, func(n ast.Node) bool {
-		if n != nil {
-			stack = append(stack, n) // push
-		} else {
-			stack = stack[:len(stack)-1] // pop
-		}
-
-		if id, ok := n.(*ast.Ident); ok {
-			if v, ok := info.Uses[id].(*types.Var); ok {
-				if pinfo, ok := paramInfos[v]; ok {
-					// Record ref information, and any intervening (shadowing) names.
-					//
-					// If the parameter v has an interface type, and the reference id
-					// appears in a context where assignability rules apply, there may be
-					// an implicit interface-to-interface widening. In that case it is
-					// not necessary to insert an explicit conversion from the argument
-					// to the parameter's type.
-					//
-					// Contrapositively, if param is not an interface type, then the
-					// assignment may lose type information, for example in the case that
-					// the substituted expression is an untyped constant or unnamed type.
-					assignable, ifaceAssign, affectsInference := analyzeAssignment(info, stack)
-					ref := refInfo{
-						Offset:             int(n.Pos() - decl.Pos()),
-						Assignable:         assignable,
-						IfaceAssignment:    ifaceAssign,
-						AffectsInference:   affectsInference,
-						IsSelectionOperand: isSelectionOperand(stack),
-					}
-					pinfo.Refs = append(pinfo.Refs, ref)
-					pinfo.Shadow = pinfo.Shadow.add(info, fieldObjs, pinfo.Name, stack)
-				}
-			}
-		}
-		return true
-	})
-
-	// Compute subset and order of parameters that are strictly evaluated.
-	// (Depends on Refs computed above.)
-	effects = calleefx(info, decl.Body, paramInfos)
-	logf("effects list = %v", effects)
-
-	falcon := falcon(logf, fset, paramInfos, info, decl)
-
-	return params, results, effects, falcon
-}
-
-// -- callee helpers --
-
-// analyzeAssignment looks at the the given stack, and analyzes certain
-// attributes of the innermost expression.
-//
-// In all cases we 'fail closed' when we cannot detect (or for simplicity
-// choose not to detect) the condition in question, meaning we err on the side
-// of the more restrictive rule. This is noted for each result below.
-//
-//   - assignable reports whether the expression is used in a position where
-//     assignability rules apply, such as in an actual assignment, as call
-//     argument, or in a send to a channel. Defaults to 'false'. If assignable
-//     is false, the other two results are irrelevant.
-//   - ifaceAssign reports whether that assignment is to an interface type.
-//     This is important as we want to preserve the concrete type in that
-//     assignment. Defaults to 'true'. Notably, if the assigned type is a type
-//     parameter, we assume that it could have interface type.
-//   - affectsInference is (somewhat vaguely) defined as whether or not the
-//     type of the operand may affect the type of the surrounding syntax,
-//     through type inference. It is infeasible to completely reverse engineer
-//     type inference, so we over approximate: if the expression is an argument
-//     to a call to a generic function (but not method!) that uses type
-//     parameters, assume that unification of that argument may affect the
-//     inferred types.
-func analyzeAssignment(info *types.Info, stack []ast.Node) (assignable, ifaceAssign, affectsInference bool) {
-	remaining, parent, expr := exprContext(stack)
-	if parent == nil {
-		return false, false, false
-	}
-
-	// TODO(golang/go#70638): simplify when types.Info records implicit conversions.
-
-	// Types do not need to match for assignment to a variable.
-	if assign, ok := parent.(*ast.AssignStmt); ok {
-		for i, v := range assign.Rhs {
-			if v == expr {
-				if i >= len(assign.Lhs) {
-					return false, false, false // ill typed
-				}
-				// Check to see if the assignment is to an interface type.
-				if i < len(assign.Lhs) {
-					// TODO: We could handle spread calls here, but in current usage expr
-					// is an ident.
-					if id, _ := assign.Lhs[i].(*ast.Ident); id != nil && info.Defs[id] != nil {
-						// Types must match for a defining identifier in a short variable
-						// declaration.
-						return false, false, false
-					}
-					// In all other cases, types should be known.
-					typ := info.TypeOf(assign.Lhs[i])
-					return true, typ == nil || types.IsInterface(typ), false
-				}
-				// Default:
-				return assign.Tok == token.ASSIGN, true, false
-			}
-		}
-	}
-
-	// Types do not need to match for an initializer with known type.
-	if spec, ok := parent.(*ast.ValueSpec); ok && spec.Type != nil {
-		for _, v := range spec.Values {
-			if v == expr {
+		for i := range sig.Results().Len() {
+			results = append(results, newParamInfo(sig.Results().At(i), true))
+		}
+	}
+
+	// Search function body for operations &x, x.f(), and x = y
+	// where x is a parameter, and record it.
+	escape(info, decl, func(v *types.Var, escapes bool) {
+		if info := paramInfos[v]; info != nil {
+			if escapes {
+				info.Escapes = true
+			} else {
+				info.Assigned = true
+			}
+		}
+	})
+
+	// Record locations of all references to parameters.
+	// And record the set of intervening definitions for each parameter.
+	//
+	// TODO(adonovan): combine this traversal with the one that computes
+	// FreeRefs. The tricky part is that calleefx needs this one first.
+	fieldObjs := fieldObjs(sig)
+	var stack []ast.Node
+	stack = append(stack, decl.Type) // for scope of function itself
+	ast.Inspect(decl.Body, func(n ast.Node) bool {
+		if n != nil {
+			stack = append(stack, n) // push
+		} else {
+			stack = stack[:len(stack)-1] // pop
+		}
+
+		if id, ok := n.(*ast.Ident); ok {
+			if v, ok := info.Uses[id].(*types.Var); ok {
+				if pinfo, ok := paramInfos[v]; ok {
+					// Record ref information, and any intervening (shadowing) names.
+					//
+					// If the parameter v has an interface type, and the reference id
+					// appears in a context where assignability rules apply, there may be
+					// an implicit interface-to-interface widening. In that case it is
+					// not necessary to insert an explicit conversion from the argument
+					// to the parameter's type.
+					//
+					// Contrapositively, if param is not an interface type, then the
+					// assignment may lose type information, for example in the case that
+					// the substituted expression is an untyped constant or unnamed type.
+					assignable, ifaceAssign, affectsInference := analyzeAssignment(info, stack)
+					ref := refInfo{
+						Offset:             int(n.Pos() - decl.Pos()),
+						Assignable:         assignable,
+						IfaceAssignment:    ifaceAssign,
+						AffectsInference:   affectsInference,
+						IsSelectionOperand: isSelectionOperand(stack),
+					}
+					pinfo.Refs = append(pinfo.Refs, ref)
+					pinfo.Shadow = pinfo.Shadow.add(info, fieldObjs, pinfo.Name, stack)
+				}
+			}
+		}
+		return true
+	})
+
+	// Compute subset and order of parameters that are strictly evaluated.
+	// (Depends on Refs computed above.)
+	effects = calleefx(info, decl.Body, paramInfos)
+	logf("effects list = %v", effects)
+
+	falcon := falcon(logf, fset, paramInfos, info, decl)
+
+	return params, results, effects, falcon
+}
+
+// -- callee helpers --
+
+// analyzeAssignment looks at the the given stack, and analyzes certain
+// attributes of the innermost expression.
+//
+// In all cases we 'fail closed' when we cannot detect (or for simplicity
+// choose not to detect) the condition in question, meaning we err on the side
+// of the more restrictive rule. This is noted for each result below.
+//
+//   - assignable reports whether the expression is used in a position where
+//     assignability rules apply, such as in an actual assignment, as call
+//     argument, or in a send to a channel. Defaults to 'false'. If assignable
+//     is false, the other two results are irrelevant.
+//   - ifaceAssign reports whether that assignment is to an interface type.
+//     This is important as we want to preserve the concrete type in that
+//     assignment. Defaults to 'true'. Notably, if the assigned type is a type
+//     parameter, we assume that it could have interface type.
+//   - affectsInference is (somewhat vaguely) defined as whether or not the
+//     type of the operand may affect the type of the surrounding syntax,
+//     through type inference. It is infeasible to completely reverse engineer
+//     type inference, so we over approximate: if the expression is an argument
+//     to a call to a generic function (but not method!) that uses type
+//     parameters, assume that unification of that argument may affect the
+//     inferred types.
+func analyzeAssignment(info *types.Info, stack []ast.Node) (assignable, ifaceAssign, affectsInference bool) {
+	remaining, parent, expr := exprContext(stack)
+	if parent == nil {
+		return false, false, false
+	}
+
+	// TODO(golang/go#70638): simplify when types.Info records implicit conversions.
+
+	// Types do not need to match for assignment to a variable.
+	if assign, ok := parent.(*ast.AssignStmt); ok {
+		for i, v := range assign.Rhs {
+			if v == expr {
+				if i >= len(assign.Lhs) {
+					return false, false, false // ill typed
+				}
+				// Check to see if the assignment is to an interface type.
+				if i < len(assign.Lhs) {
+					// TODO: We could handle spread calls here, but in current usage expr
+					// is an ident.
+					if id, _ := assign.Lhs[i].(*ast.Ident); id != nil && info.Defs[id] != nil {
+						// Types must match for a defining identifier in a short variable
+						// declaration.
+						return false, false, false
+					}
+					// In all other cases, types should be known.
+					typ := info.TypeOf(assign.Lhs[i])
+					return true, typ == nil || types.IsInterface(typ), false
+				}
+				// Default:
+				return assign.Tok == token.ASSIGN, true, false
+			}
+		}
+	}
+
+	// Types do not need to match for an initializer with known type.
+	if spec, ok := parent.(*ast.ValueSpec); ok && spec.Type != nil {
+		if slices.Contains(spec.Values, expr) {
-				typ := info.TypeOf(spec.Type)
+			typ := info.TypeOf(spec.Type)
-				return true, typ == nil || types.IsInterface(typ), false
-			}
+			return true, typ == nil || types.IsInterface(typ), false
 		}
 	}
 
@@ -616,7 +614,7 @@
 				return true, types.IsInterface(under.Elem()), false
 			case *types.Struct: // Struct{k: expr}
 				if id, _ := kv.Key.(*ast.Ident); id != nil {
-					for fi := 0; fi < under.NumFields(); fi++ {
+					for fi := range under.NumFields() {
 						field := under.Field(fi)
 						if info.Uses[id] == field {
 							return true, types.IsInterface(field.Type()), false
@gopherbot gopherbot added the Tools This label describes issues relating to any tools in the x/tools repository. label Feb 10, 2025
@gopherbot gopherbot added this to the Unreleased milestone Feb 10, 2025
@adonovan adonovan added Refactoring Issues related to refactoring tools and removed Tools This label describes issues relating to any tools in the x/tools repository. labels Feb 10, 2025
@gabyhelp gabyhelp added the ToolProposal Issues describing a requested change to a Go tool or command-line program. label Feb 10, 2025
@pjweinb
Copy link

pjweinb commented Feb 11, 2025

Is there a rush? We could add telemetry to see how often various sizes show up in gopls.

@adonovan
Copy link
Member Author

Is there a rush?

No, it's not urgent, but the gopls 18 release notes will invite people to use modernize -fix, so we expect this logic to get exercised a lot in the near future.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/652195 mentions this issue: go/analysis: use myers diff when testing SuggestedFixes

@findleyr
Copy link
Member

What other diff libraries do is this: start with line diffs, and then narrow in to subline diffs.

Since we already have the algorithm (and it's easy to move to generics such that it can operate on lines: https://go.dev/cl/652018), can we just do the same?

For tests, we can just use line diffs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Refactoring Issues related to refactoring tools ToolProposal Issues describing a requested change to a Go tool or command-line program.
Projects
None yet
Development

No branches or pull requests

5 participants